The two-sided Lanczos algorithm is known to suffer from instability in
the form of serious breakdown. This occurs when the associated moment
matrix does not permit a triangular factorization. This work uses the
notion of a generalized pivot to inexpensively circumvent the breakdown
in most cases, with the 2x2 pivot examined in detail. The case where
the generalized pivot is of no avail is analyzed, introducing a
surprising characterization for that form of serious breakdown.
Introduction


In 1950, Lanczos introduced his method for computing eigenvalues
and eigenvectors of n x n matrices. His method soon came to be regarded
as transforming a general matrix to tridiagonal form. Unfortunately,
modifications involving considerable extra work were required to
maintain accuracy. The Lanczos process lost favor when the more stable
Givens (1954) and Householder (1958) methods were introduced.

As if to seal the fate of the Lanczos process for nonsymmetric
matrices (we call it the two-sided Lanczos algorithm), Wilkinson
produced an example which demonstrates the instability of the algorithm
even with infinite precision arithmetic (Wilkinson [1958]).

Recently the symmetric Lanczos algorithm has returned as a viable method
for finding some eigenvalues and eigenvectors of large symmetric matrices.
With the current interest in handling large problems, the nonsymmetric
Lanczos process is ripe for reconsideration.

Chapter I presents the classical two-sided Lanczos process. The
material is not new, and is presented in an informal manner so as to
provide a background and establish notation. Much of what could be
presented as formal theorems is merely noted in passing and left
without proof.

The key to our work, which Chapter I emphasizes, is the importance
of certain underlying Krylov subspaces and the relative unimportance of
the resulting tridiagonal matrix. The weakness of the two-sided
Lanczos process lies in its inflexibility in specifying the basis
vectors in the sequence of subspaces. Our remedy relaxes the Lanczos
requirements, but by as little as possible.
Chapter II introduces the general look-ahead algorithm from two
aspects taken up in Chapter I: the two-sided Gram-Schmidt process and
the LDU factorization of the moment matrix generated by the n x n
matrix B and the starting vectors p* and q. From these two perspectives
we generalize the notion of pivot to make the Lanczos process more
flexible without much extra work. There are many factors to consider in
selecting the appropriate look-ahead and Chapter II explores two
important points.

Chapter III continues the discussion of the look-ahead procedure,
but restricts the generalization of the pivot to the 2x2 case alone.
The relationships of classical factorizations to the look-ahead
procedure are shown, as well as those of some less obvious
factorizations. Finally, though somewhat out of place, the Kahan,
Parlett and Jiang (KPJ [1981]) residual bounds are generalized to
handle the 2x2 case.

From Chapters II and III we become aware of two forms of what
Wilkinson calls "serious breakdown". One form we call "curable", and
it is for this case that the look-ahead Lanczos algorithm is designed.

The other form of serious breakdown we call "incurable". For
this type of breakdown no simple procedure is available. However, in
Chapter IV we exhibit a surprising characterization of incurable
breakdown (the mismatch theorem) which shows that this rare occurrence
is only slightly less fortunate than encountering an invariant subspace.

Notation 

Throughout this work we will use B to denote the real n x n
matrix given to the algorithm, and suppose that each eigenvalue of B
is associated with only one Jordan block. In general, upper case
Roman letters will denote matrices, and lower case Roman letters will
denote vectors (though a and b will denote scalars and i, j, k, $\ell$,
m and n are reserved as integers). Upper case Greek letters are
used for special matrices (usually diagonal), lower case Greek letters
are scalars. Script letters are spaces.

Square brackets ([ ]) indicate a matrix so that $[q_1, \ldots, q_j]$ is
a matrix with columns $q_i$. The matrix $I_k$ is reserved as the k x k
identity matrix. N(A) denotes the column nullspace of the matrix A.
The norms $\|\cdot\|$ and $\|\cdot\|_F$ are the Euclidean and Frobenius
norms, respectively. Conjugate transpose is denoted by * (e.g. $A^*$)
with -* denoting conjugate transpose of the inverse.
I. The Two-Sided Lanczos Algorithm

1.1 Introduction

In this chapter we describe the Lanczos algorithm as applied to
a nonsymmetric n x n matrix B. In fact, we shall describe it in three
different ways: (i) the Gram-Schmidt process applied to Krylov
sequences, (ii) the three term recurrence relation and (iii) the
triangular factorization of the moment matrix. None of these viewpoints
is new, but each is relevant to the modification of the Lanczos
algorithm that is the focus of this work. Moreover, these sections
establish our notation.

In the course of establishing the Lanczos algorithm in the context
of exact arithmetic we want to bring out the underlying subspaces and
those which reflect a particular basis in the space. We propose that
the basic algorithm of this chapter be called the two-sided Lanczos
algorithm to distinguish it from its better known (and stable) version
for symmetric matrices. In the symmetric case the temptation to
identify $\mathbb{R}^n$ with its dual is too strong to resist and the
algorithm simplifies significantly in exchange for identifying objects
which are logically different.

The final sections of this chapter may seem out of place, being
motivated by considerations such as avoidance of overflow or underflow
in computer implementations. In exact arithmetic the particular
scaling of the Lanczos vectors is of no consequence; in practice it
does matter. We give a thorough discussion of the subject and
recommend a novel, and slightly redundant, formulation.
The material covered in this chapter is not new, so the
presentation is less formal than might be expected. Also, the reader
is assumed to be familiar with the Gram-Schmidt process for
orthonormalizing a sequence of vectors.


1.2  Krylov  subspaces  and  sequences 


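Given non-zero $q \in \mathbb{C}^n$, $p^* \in (\mathbb{C}^n)^*$, the
Krylov matrices $K_\ell$ and $K_\ell^*$ are defined by

    $$ K_\ell = K_\ell(q,B) = [q, Bq, \ldots, B^{\ell-1}q] $$

    $$ K_\ell^* = K_\ell^*(p^*,B) = \begin{bmatrix} p^* \\ p^*B \\ \vdots \\ p^*B^{\ell-1} \end{bmatrix} $$

The columns of $K_\ell$ form the Krylov column sequence generated from $q$.
Similarly, the rows of $K_\ell^*$ form the Krylov row sequence generated
by $p^*$.

These column and row sequences are the primary vectors which
determine the column and row Krylov subspaces defined by

    $$ \mathcal{K}^\ell = \mathcal{K}^\ell(q,B) = \mathrm{span}\, K_\ell = K_\ell\, \mathbb{C}^\ell $$

    $$ \mathcal{K}_*^\ell = \mathcal{K}_*^\ell(p^*,B) = \mathrm{span}\, K_\ell^* = (\mathbb{C}^\ell)^* K_\ell^* $$

These subspaces play the central role in the understanding of the
Lanczos algorithm.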

The fact that $B^i q$ converges to the dominant eigenvector of $B$
as $i \to \infty$ is misleading in the context of Lanczos. There is no
interest in letting $i$ exceed $n$ (in the symmetric case, $i$ is
typically about $3\sqrt{n}$) and an important topic in approximation
theory is derivation of expressions which measure the closeness of
certain eigenvectors to $\mathcal{K}^\ell$ and $\mathcal{K}_*^\ell$.

If $\mathcal{K}^{\ell+1} = \mathcal{K}^\ell$ then it is easily verified
that $\mathcal{K}^\ell$ is invariant under $B$. Such subspaces are what
we want, and there is no loss in assuming that we have not achieved our
goal, i.e. we may assume that

    $$ \dim(\mathcal{K}^\ell) = \dim(\mathcal{K}_*^\ell) = \ell . $$

For theoretical purposes the columns of $K_\ell$ (or the rows of
$K_\ell^*$) form a satisfactory basis for $\mathcal{K}^\ell$
($\mathcal{K}_*^\ell$) and show the key role of polynomials.


LEMMA 1.1 There is a one-to-one correspondence between
$\mathcal{K}^\ell(q,B)$ and
$\mathcal{P}^{\ell-1} = \{\pi(t) : \pi(t) = \sum_{i=0}^{\ell-1} \pi_i t^i\}$.
For each $\pi \in \mathcal{P}^{\ell-1}$ there is a

    $$ \pi(B)q = \sum_{i=0}^{\ell-1} (B^i q)\,\pi_i \in \mathcal{K}^\ell , $$

and vice-versa. Similarly, $p^*\pi(B) \in \mathcal{K}_*^\ell$.


1.3  Two-Sided  Gram-Schmidt  (TSGS) 

In his original paper (Lanczos [1950]), Lanczos remarked how
round off errors made the Krylov vectors useless for computation. He
proposed better bases for $\mathcal{K}^\ell$ and $\mathcal{K}_*^\ell$ by
applying the Gram-Schmidt process to the Krylov vectors. This produces
a biorthonormal pair of bases $\{q_1, q_2, \ldots, q_\ell\}$ and
$\{p_1^*, p_2^*, \ldots, p_\ell^*\}$, one in each space. The nature
of the Gram-Schmidt process forces $q_j$ to be the component of
$B^{j-1}q$ orthogonal to $\mathcal{K}_*^{j-1}$ and $p_j^*$ the component
of $p^*B^{j-1}$ orthogonal to $\mathcal{K}^{j-1}$. The algorithm then is
as follows:
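Select $p^*$ and $q$ so that $p^*q = 1$. Set $p_1^* = p^*$, $q_1 = q$.
For $j = 1, \ldots, \ell-1$

    $$ r_{j+1} = B^j q - \sum_{i=1}^{j} q_i\,(p_i^* B^j q) $$

    $$ s_{j+1}^* = p^* B^j - \sum_{i=1}^{j} (p^* B^j q_i)\, p_i^* $$                    (1.1)

    $$ q_{j+1} = r_{j+1}/\beta^{(j+1)} , \qquad p_{j+1}^* = s_{j+1}^*/\gamma^{(j+1)} $$

where $\beta^{(j+1)}\gamma^{(j+1)} = \omega^{(j+1)} = s_{j+1}^* r_{j+1}$.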


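Note that $q_{j+1}$ is the unique vector (to within scaling) in
$\mathcal{K}^{j+1}$ orthogonal to $\mathcal{K}_*^j$. Similarly,
$p_{j+1}^*$ is unique in $\mathcal{K}_*^{j+1}$ orthogonal to
$\mathcal{K}^j$. The specification of $\beta^{(j+1)}$ is postponed until
later. For now, it is simply a non-zero scalar.

It is convenient to regard these Lanczos vectors as columns (or
rows) of matrices,

    $$ Q_\ell = [q_1, \ldots, q_\ell] , \qquad P_\ell^* = \begin{bmatrix} p_1^* \\ \vdots \\ p_\ell^* \end{bmatrix} $$

with $P_\ell^* Q_\ell = I_\ell$ by construction.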
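
The two-sided Gram-Schmidt process on explicit Krylov vectors can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: real arithmetic (so the conjugate transpose is a plain transpose), the particular scaling "divide the column vector by omega, leave the row vector unscaled", and the function name `two_sided_gram_schmidt` together with the small test matrix are hypothetical choices, not part of the original text.

```python
import numpy as np

def two_sided_gram_schmidt(B, q, p, steps, tol=1e-12):
    """Biorthogonalize the Krylov columns B^j q against the Krylov rows p^T B^j.

    Returns Q and P (vectors stored as columns) with P.T @ Q = I.
    Real arithmetic is assumed, so the conjugate transpose is a transpose.
    """
    Q, P = [], []
    kq = np.asarray(q, dtype=float)   # current Krylov column B^j q
    kp = np.asarray(p, dtype=float)   # current Krylov row p^T B^j, kept as a 1-D array
    for j in range(steps):
        # Remove the components along the vectors already accepted.
        r = kq - sum((qi * (pi @ kq) for qi, pi in zip(Q, P)), np.zeros_like(kq))
        s = kp - sum((pi * (kp @ qi) for qi, pi in zip(Q, P)), np.zeros_like(kp))
        omega = s @ r
        if abs(omega) <= tol * np.linalg.norm(r) * np.linalg.norm(s):
            raise ZeroDivisionError(f"breakdown at step {j + 1}: omega = {omega}")
        Q.append(r / omega)   # assumed scaling: beta = omega, gamma = 1
        P.append(s)
        kq = B @ kq           # next Krylov column
        kp = B.T @ kp         # next Krylov row
    return np.column_stack(Q), np.column_stack(P)

# Hypothetical nonsymmetric test matrix (subdiagonal 3, superdiagonal 1).
B = 2.0 * np.eye(4) + np.diag(np.ones(3), 1) + 3.0 * np.diag(np.ones(3), -1)
e1 = np.eye(4)[:, 0]
Q, P = two_sided_gram_schmidt(B, e1, e1, steps=4)
print(np.round(P.T @ Q, 10))   # close to the identity
```

The explicit Krylov vectors are kept here only to mirror the description; in floating point they lose accuracy quickly, which is the motivation for replacing them by products with the latest Lanczos vectors.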

Note that the Krylov vector $B^{k-1}q$ is not needed until the $k$th
iteration in (1.1). In fact, the Krylov vectors are not needed
explicitly at all. When $q_j$ and $p_j^*$ are formed in (1.1),
$B^{j-1}q$ and $p^*B^{j-1}$ can be replaced by $Bq_{j-1}$ and
$p_{j-1}^*B$, respectively. To see this we use the result of the
previous section that
$\mathcal{K}^{j-1}(q,B) = \{\pi(B)q : \pi \in \mathcal{P}^{j-2}\}$. In
particular, $q_{j-1} = \phi(B)q$ where the degree of $\phi$ is $j-2$
(otherwise, $q_{j-1}$ would be in $\mathcal{K}^{j-2}$), with leading
coefficient $\phi_{j-2} \neq 0$.
    $$ \mathrm{span}\{q_1, \ldots, q_{j-1}, Bq_{j-1}\} = \mathrm{span}\{q_1, \ldots, q_{j-1}, B\phi(B)q\} $$

    $$ = \mathrm{span}\{q_1, \ldots, q_{j-1}, \phi_{j-2}B^{j-1}q + \hat\phi(B)q\} , \qquad \hat\phi \in \mathcal{P}^{j-2} $$

    $$ = \mathrm{span}\{q_1, \ldots, q_{j-1}, \phi_{j-2}B^{j-1}q\} $$

    $$ = \mathrm{span}\{q_1, \ldots, q_{j-1}, B^{j-1}q\} = \mathcal{K}^j(q,B) $$

Similarly for $p_{j-1}^*B$ and $p^*B^{j-1}$.

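The algorithm (1.1) is then replaced by:

For $j = 1, 2, \ldots, \ell-1$

    $$ r_{j+1} = Bq_j - \sum_{i=1}^{j} q_i\,(p_i^* B q_j) $$

    $$ s_{j+1}^* = p_j^* B - \sum_{i=1}^{j} (p_j^* B q_i)\, p_i^* $$                    (1.2)

    $$ q_{j+1} = r_{j+1}/\beta_{j+1} $$

    $$ p_{j+1}^* = s_{j+1}^*/\gamma_{j+1} $$

where $\beta_{j+1}\gamma_{j+1} = \omega_{j+1} = s_{j+1}^* r_{j+1}$.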

The  beauty  of  (1.2)  is  that  the  sums  simplify  to  only  two  non-zero 
terms  as  shown  in  the  following  lemma. 

LEMMA 1.2 $p_i^* B q_j = p_j^* B q_i = 0$ for $i < j-1$.

PROOF. We will only consider $p_j^* B q_i$ since the argument is the same
for both. Consider $q_i = \pi_i(B)q$, $\pi_i \in \mathcal{P}^{i-1}$. Then
$Bq_i = B\pi_i(B)q = \hat\pi_i(B)q$, $\hat\pi_i \in \mathcal{P}^i$. Then
from Lemma 1.1, $Bq_i \in \mathcal{K}^{i+1}$. By construction,
$p_j^* \perp \mathcal{K}^i$, $i < j$. Therefore,
$p_j^* \perp \mathcal{K}^{i+1}$, $i+1 < j$. Thus, $p_j^* B q_i = 0$,
$i < j-1$.  ■


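We isolate the non-zero coefficients by the following notation:

    $$ \beta_j = p_j^* B q_{j-1} \quad (= p_j^* r_j) $$

    $$ \gamma_j = p_{j-1}^* B q_j \quad (= s_j^* q_j) $$

    $$ \alpha_j = p_j^* B q_j $$

The familiar three term recurrence then is:

    $$ q_0 = p_0 = 0 $$

    $$ q_1 = q/\beta_1 , \quad p_1^* = p^*/\gamma_1 ; \quad \beta_1\gamma_1 = \omega_1 = p^*q $$

For $j = 1, \ldots, \ell-1$ do

    $$ r_{j+1} = Bq_j - q_j\alpha_j - q_{j-1}\gamma_j $$

    $$ s_{j+1}^* = p_j^*B - \alpha_j p_j^* - \beta_j p_{j-1}^* $$                    (1.3)

    $$ q_{j+1} = r_{j+1}/\beta_{j+1} $$

    $$ p_{j+1}^* = s_{j+1}^*/\gamma_{j+1} $$

where

    $$ \beta_{j+1}\gamma_{j+1} = \omega_{j+1} = s_{j+1}^* r_{j+1} $$                    (1.4)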


Clearly, if $\omega_{j+1} = 0$ the algorithm cannot continue. An analysis
of, and a remedy for, this condition is the purpose of this work. For
the moment, though, we will assume that $\omega_i \neq 0$ for
$i = 1, \ldots, j+1$.
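
As an illustration of the three term recurrence, the following Python/NumPy sketch runs the two-sided process with only the two most recent vectors on each side and assembles the tridiagonal matrix of coefficients. The assumptions are not from the text: real arithmetic, the scaling $\beta_j = \sqrt{|\omega_j|}$, the function name `two_sided_lanczos`, the relative breakdown tolerance `tol`, and the small tridiagonal test matrix, which was chosen so that every $\omega_j$ is known to be non-zero.

```python
import numpy as np

def two_sided_lanczos(B, q, p, steps, tol=1e-12):
    """Three-term two-sided Lanczos recurrence in real arithmetic.

    Returns Q, P (vectors as columns) and the tridiagonal J = P.T @ B @ Q.
    The scaling beta_j = sqrt(|omega_j|) is one possible choice, not the only one.
    """
    n = B.shape[0]
    omega = p @ q
    beta = np.sqrt(abs(omega))
    gamma = omega / beta
    Q, P = [q / beta], [p / gamma]
    alphas, betas, gammas = [], [], []
    q_old, p_old = np.zeros(n), np.zeros(n)
    for j in range(steps):
        qj, pj = Q[-1], P[-1]
        alpha = pj @ (B @ qj)
        alphas.append(alpha)
        if j == steps - 1:
            break
        r = B @ qj - qj * alpha - q_old * gamma      # gamma holds gamma_j here
        s = B.T @ pj - alpha * pj - beta * p_old     # beta holds beta_j here
        omega = s @ r
        if abs(omega) <= tol * np.linalg.norm(r) * np.linalg.norm(s):
            raise ZeroDivisionError("omega vanishes: the recurrence cannot continue")
        beta = np.sqrt(abs(omega))
        gamma = omega / beta
        betas.append(beta)
        gammas.append(gamma)
        q_old, p_old = qj, pj
        Q.append(r / beta)
        P.append(s / gamma)
    J = np.diag(alphas) + np.diag(betas, -1) + np.diag(gammas, 1)
    return np.column_stack(Q), np.column_stack(P), J

# Hypothetical nonsymmetric tridiagonal example with p = q = e_1.
B = (np.diag([4.0, 3.0, 2.0, 1.0, 0.5])
     + np.diag([1.0, 2.0, 0.5, 4.0], 1)      # superdiagonal
     + np.diag([3.0, 1.0, 2.0, 0.5], -1))    # subdiagonal
e1 = np.eye(5)[:, 0]
Q, P, J = two_sided_lanczos(B, e1, e1, steps=5)
print(np.round(P.T @ Q, 10))   # biorthogonality
print(np.round(J, 6))          # tridiagonal matrix of alpha, beta, gamma
```

For this example each $\omega_{j+1}$ is the product of the matching sub- and superdiagonal entries of B, so the chosen scaling returns a symmetric J even though B is not symmetric.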

This ends the traditional description of the (two-sided) Lanczos
algorithm. The discussion of the problem of termination is postponed
until our stabilizing algorithm is presented (Chapters II and III). We
now illuminate various relationships which govern the coefficients
$\alpha$, $\beta$ and $\gamma$, and the Lanczos vectors $\{p_i^*\}$ and
$\{q_i\}$.


1.4 The Lanczos polynomial and the moment matrix

From the previous section we have $q_k = \pi(B)q$ where $\pi$ is of
degree $k-1$. It is convenient to specify $\pi$ as follows

    $$ q_k = \frac{1}{\beta^{(k)}}\, \chi_{k-1}(B)\, q $$                    (1.5)

where $\chi_{k-1}(t)$ is a monic polynomial of degree $k-1$ and
$\beta^{(k)}$ is defined in (1.1). The three term recurrence (1.3) for
the $q_i$'s yields a related recurrence for the Lanczos polynomials
$\chi_i$. Set $\chi_{-1}(t) = 0$, $\chi_0(t) = 1$. Then, by substituting
(1.5) in (1.3) one obtains

    $$ \chi_k(t) = (t - \alpha_k)\,\chi_{k-1}(t) - \omega_k\,\chi_{k-2}(t) $$                    (1.6)

where $\alpha_k$ and $\omega_k$ are the coefficients appearing in (1.3)
and (1.4). Similarly,

    $$ p_k^* = \frac{1}{\gamma^{(k)}}\, p^*\, \chi_{k-1}(B) . $$

Moreover, $\beta^{(k)} = \beta_1\beta_2\cdots\beta_k$ and
$\gamma^{(k)} = \gamma_1\gamma_2\cdots\gamma_k$. Further, it is the
product $\omega_k$ ($= \beta_k\gamma_k$) which determines the Lanczos
polynomials. The choice of $\beta_k$ only affects the norms of $q_k$ and
$p_k^*$.

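Next we relate the Lanczos polynomial to a certain triangular
matrix. Recall

    $$ K_j = [q, Bq, \ldots, B^{j-1}q] , \qquad K_j^* = \begin{bmatrix} p^* \\ p^*B \\ \vdots \\ p^*B^{j-1} \end{bmatrix} $$

The two-sided Gram-Schmidt process dictates that

    $$ Q_j = [q_1, \ldots, q_j] = K_j L_j^{-*} \Delta_\beta^{-1} $$                    (1.7a)

where $L_j^{-*}$ is some j x j unit upper triangular matrix and
$\Delta_\beta = \mathrm{diag}\{\beta^{(1)}, \ldots, \beta^{(j)}\}$, and,
similarly,

    $$ P_j^* = \Delta_\gamma^{-1} \hat L_j^{-1} K_j^* $$                    (1.7b)

where $\Delta_\gamma = \mathrm{diag}\{\gamma^{(1)}, \ldots, \gamma^{(j)}\}$
and $\hat L_j$ is unit lower triangular.

Note that the inverse of a unit lower triangular matrix is also
unit lower triangular. Thus, the $i$th row of $\hat L_j^{-1}$ contains
the coefficients of the $(i-1)$st Lanczos polynomial $\chi_{i-1}$ since

    $$ p_i^* = \frac{1}{\gamma^{(i)}}\, p^*\chi_{i-1}(B) = \frac{1}{\gamma^{(i)}}\, (\chi_0^{(i-1)}, \chi_1^{(i-1)}, \ldots, \chi_{i-1}^{(i-1)}, 0, \ldots, 0)\, K_j^* $$

where $\chi_m^{(i-1)}$ denotes the coefficient of $t^m$ in $\chi_{i-1}(t)$.
Similarly the $i$th column of $L_j^{-*}$ contains the coefficients of the
$(i-1)$st Lanczos polynomial, so that $\hat L_j = L_j$.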

Note, now, that

    $$ K_j^* K_j =: M_j(p^*, q, B) $$                    (1.8)

the moment matrix whose $(i,k)$th element is $p^*B^{i+k-2}q$. Using
(1.8), along with a rearrangement of (1.7a) and (1.7b) and
$P_j^*Q_j = I_j$, we have

    $$ M_j = K_j^*K_j = L_j\Delta_\gamma P_j^*Q_j\Delta_\beta L_j^* = L_j\Omega_jL_j^* $$

where $\Omega_j = \Delta_\gamma\Delta_\beta = \mathrm{diag}\{\omega_1, \omega_1\omega_2, \ldots, \omega_1\omega_2\cdots\omega_j\}$.
That is, running the Lanczos algorithm for j steps is equivalent to the
triangular factorization of the moment matrix $M_j$. In particular, a
breakdown in the Lanczos algorithm ($\omega_i = 0$) corresponds to
failure in the triangular factorization and vice versa. We may, if it
helps, consider the problem of stability in terms of the extensively
studied triangular factorization.

As noted before, these relationships are not new (Lanczos himself
used Gaussian elimination in his original algorithm), but neither are
they widely comprehended.
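
The link between breakdown and the triangular factorization of the moment matrix can be seen numerically. The sketch below builds the Hankel moment matrix from the moments $p^*B^iq$ and prints its leading principal minors; a vanishing minor means the factorization without pivoting fails at that step. The 3 x 3 matrix, the starting vectors and the helper name `moment_matrix` are illustrative assumptions, not taken from the text.

```python
import numpy as np

def moment_matrix(B, p, q, j):
    """j x j Hankel moment matrix with (i, k) entry p^T B^(i+k) q (0-based i, k)."""
    moments = []
    v = np.asarray(q, dtype=float)
    for _ in range(2 * j - 1):
        moments.append(p @ v)   # moment p^T B^i q for the current power i
        v = B @ v
    return np.array([[moments[i + k] for k in range(j)] for i in range(j)])

# Hypothetical example: identity plus a cyclic shift, started from e_1 on both sides.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
e1 = np.array([1.0, 0.0, 0.0])
M3 = moment_matrix(B, e1, e1, 3)
minors = [np.linalg.det(M3[:j, :j]) for j in (1, 2, 3)]
print(M3)
print(minors)   # the second leading minor vanishes, the third does not
```

In this example the second step of the Lanczos recurrence produces non-zero residual vectors whose inner product is zero, while the full 3 x 3 moment matrix is still nonsingular.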

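1.5 Matrix formulation of the Lanczos Algorithm

The three term recurrence can be written compactly by the
introduction of a tridiagonal matrix, J. Consider algorithm (1.3), and
write $r_i = \beta_i q_i$ and $s_i^* = \gamma_i p_i^*$. Then

    $$ [r_2, \ldots, r_{k+1}] = [Bq_1 - q_1\alpha_1,\; Bq_2 - q_2\alpha_2 - q_1\gamma_2,\; \ldots,\; Bq_k - q_k\alpha_k - q_{k-1}\gamma_k] $$

and

    $$ \begin{bmatrix} s_2^* \\ \vdots \\ s_{k+1}^* \end{bmatrix} = \begin{bmatrix} p_1^*B - \alpha_1 p_1^* \\ \vdots \\ p_k^*B - \alpha_k p_k^* - \beta_k p_{k-1}^* \end{bmatrix} $$

becomes

    $$ [0, \ldots, 0, r_{k+1}] = BQ_k - Q_kJ_k $$                    (1.10a)

    $$ \begin{bmatrix} 0 \\ \vdots \\ 0 \\ s_{k+1}^* \end{bmatrix} = P_k^*B - J_kP_k^* $$                    (1.10b)

where $J_k$ is the k x k tridiagonal matrix with diagonal
$\alpha_1, \ldots, \alpha_k$, subdiagonal $\beta_2, \ldots, \beta_k$ and
superdiagonal $\gamma_2, \ldots, \gamma_k$.

A simplified notation for the rank one matrices simplifies the
matrix formulations (1.10a) and (1.10b) to

    $$ r_{k+1}e_k^* = BQ_k - Q_kJ_k $$

    $$ e_k s_{k+1}^* = P_k^*B - J_kP_k^* $$

with the convention that $e_k$ is the last column of the k x k
identity matrix, $I_k$, while the other vectors are of dimension n.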

Finally, we relate $J_k$ to the Lanczos polynomials by considering
its characteristic polynomial, $\det(tI_k - J_k)$. Expanding by the
bottom row, we find

    $$ \det(tI_k - J_k) = (t - \alpha_k)\det(tI_{k-1} - J_{k-1}) - (-\beta_k)(-\gamma_k)\det(tI_{k-2} - J_{k-2}) . $$

Further, $\det(tI_1 - J_1) = t - \alpha_1 = \chi_1(t)$. Recall that
$\beta_k\gamma_k = \omega_k$ and compare with (1.6) to see that
$\chi_k(t)$ is the characteristic polynomial of $J_k$.
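
A quick numerical check of this identity: the sketch below evaluates the recurrence (1.6) on coefficient arrays and compares the result with the characteristic polynomial of a tridiagonal matrix having the same $\alpha_k$ and the same products $\beta_k\gamma_k$. The numbers and the helper name `lanczos_polynomial` are made up for illustration; `numpy.poly` applied to a square matrix returns the coefficients of its characteristic polynomial, highest degree first.

```python
import numpy as np

def lanczos_polynomial(alpha, omega):
    """Coefficients (highest degree first) of chi_k from recurrence (1.6).

    alpha[i] holds alpha_{i+1} and omega[i] holds omega_{i+1};
    omega[0] multiplies chi_{-1} = 0 and therefore has no effect.
    """
    prev = np.array([0.0])   # chi_{-1}
    cur = np.array([1.0])    # chi_0
    for a, w in zip(alpha, omega):
        nxt = np.polysub(np.polymul([1.0, -a], cur), w * prev)
        prev, cur = cur, nxt
    return cur

# Hypothetical coefficients; only the products beta_k * gamma_k matter.
alpha = [2.0, -1.0, 0.5]
beta = [1.0, 3.0, 0.5]     # beta_1 is not used in J
gamma = [1.0, 2.0, 4.0]    # gamma_1 is not used in J
omega = [b * g for b, g in zip(beta, gamma)]

J = np.diag(alpha) + np.diag(beta[1:], -1) + np.diag(gamma[1:], 1)
chi = lanczos_polynomial(alpha, omega)
print(chi)                   # coefficients of chi_3
print(np.real(np.poly(J)))   # characteristic polynomial of J_3
```

Changing `beta` and `gamma` while keeping their products fixed leaves both printed polynomials unchanged, which is the statement that only $\omega_k$ determines the Lanczos polynomials.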


Another approach to formulating some of these results comes from
the theory of orthogonal polynomials with respect to an inner product.
In our case the $\chi_i$ are orthogonal with respect to the inner product
induced by the moment matrix $M(p^*, q, B)$ (see Brezinski [1980]). But
we will not make explicit use of this viewpoint, since M is not
guaranteed to be positive definite, and one's intuition may be misled
by the improper inner product induced by it.

1.6 The moment matrix and the Lanczos polynomial

Up to this point, we have linked the moment matrix to the Lanczos
algorithm but not to our goal of finding some eigenvalues of a
nonsymmetric matrix. The dependence of the approximate eigenvalues on
only the moment matrix will be of importance in later chapters.

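Let

    $$ M_{i,j} = \begin{bmatrix} m_i & m_{i+1} & \cdots & m_j \\ m_{i+1} & & & \vdots \\ \vdots & & & \\ m_j & m_{j+1} & \cdots & m_{2j-i} \end{bmatrix} $$

where $m_i = p^*B^iq$. Then

LEMMA 1.3

    $$ \chi_k(t) = \frac{\det(tM_{0,k-1} - M_{1,k})}{\omega^{(1)}\cdots\omega^{(k)}} $$

where $\omega^{(i)} = \omega_1\omega_2\cdots\omega_i$.

PROOF.

    $$ J_k = P_k^*BQ_k = \Delta_\gamma^{-1}L_k^{-1}K_k^*BK_kL_k^{-*}\Delta_\beta^{-1} \quad \text{from (1.7a), (1.7b)} $$

    $$ = \Delta_\gamma^{-1}L_k^{-1}M_{1,k}L_k^{-*}\Delta_\beta^{-1} $$

and

    $$ I_k = P_k^*Q_k = \Delta_\gamma^{-1}L_k^{-1}M_{0,k-1}L_k^{-*}\Delta_\beta^{-1} . $$

Since $L_k$ is unit lower triangular

    $$ \chi_k(t) = \det(tI_k - J_k) $$

    $$ = \det(\Delta_\gamma^{-1})\,\det(tM_{0,k-1} - M_{1,k})\,\det(\Delta_\beta^{-1}) $$

    $$ = \det(tM_{0,k-1} - M_{1,k})/(\omega^{(1)}\cdots\omega^{(k)}) .  \qquad ■ $$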
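
The dependence of the eigenvalue approximations on the moments alone can be illustrated for the extreme case k = n, where the two Krylov matrices are square. Assuming both are invertible, the pencil formed from the two Hankel matrices of moments has the same characteristic polynomial as B itself. The example matrix, the starting vectors and the helper name `hankel_moments` below are hypothetical.

```python
import numpy as np

def hankel_moments(B, p, q, first, size):
    """Hankel matrix whose (i, k) entry is p^T B^(first+i+k) q, 0 <= i, k < size."""
    m = []
    v = np.linalg.matrix_power(B, first) @ q
    for _ in range(2 * size - 1):
        m.append(p @ v)
        v = B @ v
    return np.array([[m[i + k] for k in range(size)] for i in range(size)])

# Hypothetical example: identity plus a cyclic shift, p = q = e_1.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
e1 = np.array([1.0, 0.0, 0.0])

M0 = hankel_moments(B, e1, e1, 0, 3)   # moments m_0 .. m_4
M1 = hankel_moments(B, e1, e1, 1, 3)   # moments m_1 .. m_5

# det(t*M0 - M1) / det(M0) is the characteristic polynomial of M0^{-1} M1.
pencil_poly = np.real(np.poly(np.linalg.solve(M0, M1)))
matrix_poly = np.real(np.poly(B))
print(pencil_poly)
print(matrix_poly)
```

Only the scalar moments enter `M0` and `M1`; the matrix B is used here solely to generate them and to provide the reference polynomial.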

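The moment matrix $M_{i,j}$ is a special Hankel matrix (the
$(k,\ell)$th element is a function of $(k+\ell)$) and this fact yields
another determinantal description of the Lanczos polynomial.

LEMMA 1.4

    $$ \chi_k(t) = \frac{1}{\omega^{(1)}\cdots\omega^{(k)}}\, \det \begin{bmatrix} m_0 & \cdots & m_k \\ \vdots & & \vdots \\ m_{k-1} & \cdots & m_{2k-1} \\ 1 & \cdots & t^k \end{bmatrix} $$                    (1.11)

PROOF. The (k+1) x (k+1) matrix on the right of (1.11) can be expressed
as

    $$ \begin{bmatrix} M_{0,k-1} & m^{(k)} \\ (1, t, \ldots, t^{k-1}) & t^k \end{bmatrix} , \qquad m^{(k)} = (m_k, \ldots, m_{2k-1})^T . $$

Observe that

    [display equation not recoverable from the source]

Take determinants, cancel $t$, and use Lemma 1.3 to obtain the
formula.  ■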
1.7 Off-diagonal elements of J

Up to now, no mention of the selection of $\beta_j$ (and thus
$\gamma_j$) has been made. In exact arithmetic no consideration is
necessary, since the directions of the $p_j$'s and the $q_j$'s are the
determining factors. In finite precision, this is not the case.

Such practical considerations may appear out of place in our
discussion of the two-sided Lanczos process, but are necessary and
proper. The problems of stabilization must be attacked in the context
in which they are encountered, not in the ideal. Thus we assume that
we are now working in finite arithmetic and must adjust accordingly.

We certainly wish to avoid extremes in the selection of $\beta_j$.
Taking $\beta_j$ to be the largest machine number, for example, is
unreasonable. Not so obvious is the danger in the innocent choice of
$\beta_j = 1$, as the following example shows.


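EXAMPLE 1.1

    $$ B = \begin{bmatrix} \alpha_1 & 2^{-m} & & \\ 2^{-m} & \alpha_2 & \ddots & \\ & \ddots & \ddots & 2^{-m} \\ & & 2^{-m} & \alpha_n \end{bmatrix} , \qquad p_1 = e_1 = q_1 $$

where m is some small positive integer (say 4) and $\alpha_i$ is
arbitrary. It is easily verified that the Lanczos algorithm produces

    $$ p_j^* = 2^{(j-1)m}\, e_j^* , \qquad q_j = 2^{-(j-1)m}\, e_j , $$

    $$ J = \begin{bmatrix} \alpha_1 & 2^{-2m} & & \\ 1 & \alpha_2 & \ddots & \\ & \ddots & \ddots & 2^{-2m} \\ & & 1 & \alpha_n \end{bmatrix} $$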

From this example we see that even for a symmetric matrix, the
sacrifice of symmetry causes exponential growth in the elements of one
set of vectors and exponential decline in the other.

If symmetry is such a desirable property, perhaps the selection
of $\beta_j = |\gamma_j| = \sqrt{|\omega_j|}$ would be better. As it
turns out, we gain nothing over $\beta_j = 1$ as the following example
demonstrates.


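EXAMPLE 1.2

    $$ B = \begin{bmatrix} \alpha_1 & 2^{m} & & \\ 2^{-m} & \alpha_2 & \ddots & \\ & \ddots & \ddots & 2^{m} \\ & & 2^{-m} & \alpha_n \end{bmatrix} , \qquad p_1,\ q_1,\ m \text{ and } \alpha_i \text{ as before} $$

Here again $p_j^* = 2^{(j-1)m}\, e_j^*$ and $q_j = 2^{-(j-1)m}\, e_j$, while

    $$ J = \begin{bmatrix} \alpha_1 & 1 & & \\ 1 & \alpha_2 & \ddots & \\ & \ddots & \ddots & 1 \\ & & 1 & \alpha_n \end{bmatrix} $$

So the entity of interest is again not the resulting matrix, but
the bases with which we are dealing.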
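
Both examples can be reproduced numerically. The sketch below runs the three term recurrence with a user-supplied rule for $\beta_j$ and prints the norms of the resulting vectors. Real arithmetic, the function name `lanczos_vectors`, the values n = 4 and m = 4, and the diagonal entries are assumptions made for illustration; no breakdown test is included because $\omega_j \neq 0$ is known for these two matrices.

```python
import numpy as np

def lanczos_vectors(B, q, p, steps, choose_beta):
    """Two-sided Lanczos in real arithmetic; choose_beta(omega) fixes the scaling beta_j."""
    n = len(q)
    omega = p @ q
    beta = choose_beta(omega)
    gamma = omega / beta
    Q, P = [q / beta], [p / gamma]
    q_old, p_old = np.zeros(n), np.zeros(n)
    for _ in range(steps - 1):
        qj, pj = Q[-1], P[-1]
        alpha = pj @ B @ qj
        r = B @ qj - alpha * qj - gamma * q_old    # gamma is gamma_j at this point
        s = B.T @ pj - alpha * pj - beta * p_old   # beta is beta_j at this point
        omega = s @ r
        beta = choose_beta(omega)
        gamma = omega / beta
        q_old, p_old = qj, pj
        Q.append(r / beta)
        P.append(s / gamma)
    return np.column_stack(Q), np.column_stack(P)

n, m = 4, 4
alpha = np.array([0.3, -1.2, 2.5, 0.7])     # arbitrary diagonal entries
ones = np.ones(n - 1)
e1 = np.eye(n)[:, 0]

# Example 1.1: symmetric B, scaling beta_j = 1.
B1 = np.diag(alpha) + 2.0 ** (-m) * (np.diag(ones, 1) + np.diag(ones, -1))
Q1, P1 = lanczos_vectors(B1, e1, e1, n, lambda omega: 1.0)

# Example 1.2: nonsymmetric B, scaling beta_j = sqrt(|omega_j|).
B2 = np.diag(alpha) + 2.0 ** m * np.diag(ones, 1) + 2.0 ** (-m) * np.diag(ones, -1)
Q2, P2 = lanczos_vectors(B2, e1, e1, n, lambda omega: np.sqrt(abs(omega)))

print(np.linalg.norm(Q1, axis=0))   # norms of q_j shrink by 2^-m per step
print(np.linalg.norm(P1, axis=0))   # norms of p_j grow by 2^m per step
```

Both runs return the same pair of bases, which is the point of the two examples: the scaling rule alone does not control the growth.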
The risk of element growth (or decline) can be reduced by forcing
the norms of the resulting vectors, $p_\ell^*$ and $q_\ell$, to be equal.
This criterion forces

    $$ \beta_\ell = (|\omega_\ell|\, \|r_\ell\| / \|s_\ell\|)^{1/2} $$

    $$ \gamma_\ell = \mathrm{sign}(\omega_\ell)\, (|\omega_\ell|\, \|s_\ell\| / \|r_\ell\|)^{1/2} $$

Since $\omega_\ell = s_\ell^* r_\ell$,

    $$ \|p_\ell\|^2 = \|q_\ell\|^2 = (|\omega_\ell| / (\|r_\ell\|\, \|s_\ell\|))^{-1} $$

    $$ = (\cos(\angle(s_\ell, r_\ell)))^{-1} $$

    $$ = \sec(\angle(s_\ell, r_\ell)) = \sec(\angle(p_\ell, q_\ell)) $$

Therefore, the norm of each vector is not less than one and
becomes large only as the vector pair $(p_\ell, q_\ell)$ approaches
orthogonality. Further, if B is symmetric, the process reduces (with
$p_1 = q_1$) to the symmetric Lanczos algorithm.
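
The equal-norm criterion is easy to check on a single pair of residual vectors. In the sketch below the vectors `r` and `s` are arbitrary made-up data and `balanced_scaling` is a hypothetical helper name; real arithmetic is assumed.

```python
import numpy as np

def balanced_scaling(r, s):
    """Return (beta, gamma) with beta * gamma = s.r and ||r / beta|| = ||s / gamma||."""
    omega = s @ r
    nr, ns = np.linalg.norm(r), np.linalg.norm(s)
    beta = np.sqrt(abs(omega) * nr / ns)
    gamma = np.sign(omega) * np.sqrt(abs(omega) * ns / nr)
    return beta, gamma

r = np.array([3.0, 4.0, 0.0])
s = np.array([1.0, 0.0, 2.0])
beta, gamma = balanced_scaling(r, s)
q, p = r / beta, s / gamma

secant = np.linalg.norm(r) * np.linalg.norm(s) / abs(s @ r)
print(beta * gamma, s @ r)                              # product reproduces omega
print(np.linalg.norm(q) ** 2, np.linalg.norm(p) ** 2)   # equal squared norms
print(secant)                                           # secant of the angle between s and r
```

The three printed quantities in the last two lines agree, and since a secant is never smaller than one in magnitude, neither vector can shrink below unit norm under this scaling.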

1.8 The generalized problem

An alternative to allowing any growth in the elements of $p_\ell^*$ and
$q_\ell$ is to force the norms of these vectors to be unity. We then have
that $p_\ell^*q_\ell = \cos(\angle(p_\ell^*, q_\ell))$. In terms of the
ultimate goal of our work, we have generalized the problem to finding
eigenvalues for the matrix pencil $(J_\ell, \Psi_\ell)$ where

    $$ J_\ell = P_\ell^* B Q_\ell $$

and

    $$ \Psi_\ell = \mathrm{diag}\{\psi_1, \psi_2, \ldots, \psi_\ell\} = P_\ell^* Q_\ell . $$
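Pictorially, (1.10a) and (1.10b) become

    $$ r_{\ell+1} e_\ell^* = BQ_\ell - Q_\ell \Psi_\ell^{-1} J_\ell $$

    $$ e_\ell s_{\ell+1}^* = P_\ell^* B - J_\ell \Psi_\ell^{-1} P_\ell^* $$

The three term recurrences (1.3) then are

    $$ r_{\ell+1} = Bq_\ell - q_\ell\,(\alpha_\ell/\psi_\ell) - q_{\ell-1}\,(\gamma_\ell/\psi_{\ell-1}) $$

    $$ s_{\ell+1}^* = p_\ell^* B - (\alpha_\ell/\psi_\ell)\, p_\ell^* - (\beta_\ell/\psi_{\ell-1})\, p_{\ell-1}^* $$                    (1.12a)

    $$ \omega_{\ell+1} = s_{\ell+1}^* r_{\ell+1} $$

    $$ q_{\ell+1} = r_{\ell+1}/\|r_{\ell+1}\| , \qquad p_{\ell+1}^* = s_{\ell+1}^*/\|s_{\ell+1}\| $$

    $$ \beta_{\ell+1} = p_{\ell+1}^* B q_\ell = \omega_{\ell+1}/\|s_{\ell+1}\| $$

    $$ \gamma_{\ell+1} = p_\ell^* B q_{\ell+1} = \omega_{\ell+1}/\|r_{\ell+1}\| $$                    (1.12b)

    $$ \psi_{\ell+1} = p_{\ell+1}^* q_{\ell+1} = \cos(\angle(p_{\ell+1}^*, q_{\ell+1})) = \omega_{\ell+1}/(\|s_{\ell+1}\|\, \|r_{\ell+1}\|) $$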

1.9 Summary

The insight that comes from relating the Lanczos algorithm to the
moment matrix is this: once $p^*$ and $q$ are chosen the success or
failure of the process is determined. If any of the moment matrices
$M_j$ is singular then the algorithm will halt at step $j-1$ with
$\omega_j = 0$. If either $r_j = 0$ or $s_j = 0$, then an invariant
subspace, our goal, has been captured; otherwise (i.e. $\omega_j = 0$,
$s_j \neq 0$ and $r_j \neq 0$), the choice of $p^*$ and $q$ was
unfortunate. It could happen that no eigenvalue of $J_j$ is close to an
eigenvalue of B and in such a case, the effort seems wasted. This is
called serious breakdown.


The foregoing analysis shows that the Lanczos scheme is too rigid
to be stable. The great practical advantage is that the projection of
B onto $\mathcal{K}^\ell$ and $\mathcal{K}_*^\ell$ is tridiagonal. If
these spaces contain good approximations to the desired eigenvectors,
then the computation of these approximations requires the calculation
of certain eigenpairs of an $\ell$ x $\ell$ tridiagonal matrix, a
relatively easy task.
II. The Look-Ahead Lanczos Algorithm


2.1 Introduction

The ideas presented in Chapter I not only establish the Lanczos
algorithm from a general mathematical perspective, but lay the
foundation for the modifications that give it more flexibility. That
the subspaces, and not the particular bases, are important allows a
modification of the Lanczos process which furnishes a potentially
powerful tool.

The "look-ahead" Lanczos algorithm presented below is a way to
relax the two-sided Lanczos algorithm. As in Chapter I, the two-sided
Gram-Schmidt process and the factorization of moment matrices play
important roles in the understanding of our new algorithm.

We will assume here that $\mathcal{K}^n = \mathbb{R}^n$ and
$\mathcal{K}_*^n = (\mathbb{R}^n)^*$. This is not necessary for the
study but simplifies the presentation. In those places where
$\mathcal{K}^n = \mathbb{R}^n$ is assumed, we might just as easily
assume a sufficiently large invariant subspace, but such conditions add
nothing to the presentation and conceal key ideas.


2.2  Breakdown  and  the  Two-Sided  Gram-Schmidt 

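To understand breakdown in Lanczos and our remedy for it, it is
necessary to focus on the generation of two inter-related sets of
vectors. The sets, related by biorthogonality, form bases for the row
and column spaces $\mathcal{K}_*^n$ and $\mathcal{K}^n$, respectively.

First, though, let us consider the general case. Let
$\mathcal{F}^k = \mathrm{span}\{f_1, \ldots, f_k\}$,
$\mathcal{G}_*^k = \mathrm{span}\{g_1^*, \ldots, g_k^*\}$ with
$\mathcal{F}^n = \mathbb{R}^n$ and $\mathcal{G}_*^n = (\mathbb{R}^n)^*$.
Define the matrices

    $$ F_k = [f_1, \ldots, f_k] , \qquad G_k^* = \begin{bmatrix} g_1^* \\ \vdots \\ g_k^* \end{bmatrix} $$

where the columns of $F_k$ form the primary vectors of $\mathcal{F}^k$
and the rows of $G_k^*$ form the primary vectors of $\mathcal{G}_*^k$.
Note that $\mathrm{rank}(F_k) = \mathrm{rank}(G_k^*) = k$.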


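Applying the two-sided Gram-Schmidt process to $\mathcal{F}^{k-1}$ and
$\mathcal{G}_*^{k-1}$, we get

    $$ \hat{\mathcal{F}}^{k-1} = \mathrm{span}\{\hat f_1, \ldots, \hat f_{k-1}\} = \mathcal{F}^{k-1} $$

    $$ \hat{\mathcal{G}}_*^{k-1} = \mathrm{span}\{\hat g_1^*, \ldots, \hat g_{k-1}^*\} = \mathcal{G}_*^{k-1} $$

where $\hat f_i \in \mathcal{F}^i$, $\hat g_i^* \in \mathcal{G}_*^i$,
$\|\hat f_i\| = \|\hat g_i\| = 1$, and
$\hat g_i^* \hat f_j = \delta_{ij}\psi_i$ with
$\psi_i = \cos\angle(\hat g_i^*, \hat f_i)$. Next we apply TSGS to
$\mathcal{F}^k$ and $\mathcal{G}_*^k$ by forming

    $$ \tilde f_k = f_k - \sum_{i=1}^{k-1} \hat f_i\, (\hat g_i^* f_k)/\psi_i $$

    $$ \tilde g_k^* = g_k^* - \sum_{i=1}^{k-1} (g_k^* \hat f_i/\psi_i)\, \hat g_i^* $$                    (2.1)

and then normalize to obtain

    $$ \hat f_k = \tilde f_k/\|\tilde f_k\| , \qquad \hat g_k^* = \tilde g_k^*/\|\tilde g_k\| , \qquad \psi_k = \hat g_k^* \hat f_k . $$

Note that neither $\tilde f_k$ nor $\tilde g_k$ is zero since the f's and
g's form bases for $\mathbb{R}^n$ and $(\mathbb{R}^n)^*$ (this point will
be elaborated below).

Let us assume that $\tilde g_k^* \tilde f_k = 0$ and thus
$\psi_k = \hat g_k^* \hat f_k = 0$. We cannot continue the Gram-Schmidt
process since that would involve division by zero in (2.1). A different
pair of sets may be selected to replace $\{f_1, \ldots, f_n\}$ and
$\{g_1^*, \ldots, g_n^*\}$ since the latter pair proved unsatisfactory.
Unfortunately, we have no more guarantee of success with a new pair of
sets than we did with the original pair.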


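EXAMPLE 2.1: n = 3

    $$ F_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \text{and} \qquad G_3^* = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} $$

    $$ \tilde f_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} - 0 \cdot \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} $$

    $$ \tilde g_2^* = [0, 0, 1] - 0 \cdot [.5, 0, .5] = [0, 0, 1] $$

so that $\tilde g_2^* \tilde f_2 = 0$.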


2.3 Look-ahead in Two-Sided Gram-Schmidt

The defect with the standard TSGS does not lie with the primary
vectors $\{f_1, \ldots, f_n\}$ and $\{g_1^*, \ldots, g_n^*\}$ since these
form bases for $\mathbb{R}^n$ and $(\mathbb{R}^n)^*$. Rather, one (or
both) sequences of vectors was used in an unfortunate order. For each
$\hat f_j$ there must be some $m \geq j$ such that
$g_m^* \hat f_j \neq 0$. We formalize this concept in the following
lemma.

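LEMMA 2.1. Let $\mathcal{F}^k = \mathrm{span}\{f_1, \ldots, f_k\}$ and
$\mathcal{G}_*^k = \mathrm{span}\{g_1^*, \ldots, g_k^*\}$ with
$\mathcal{F}^n = \mathbb{R}^n$ and $\mathcal{G}_*^n = (\mathbb{R}^n)^*$.
Let $\hat f_i$ be defined by $\|\hat f_i\| = 1$,
$\hat f_i \in \mathcal{F}^i$, $\hat f_i \perp \mathcal{G}_*^{i-1}$, and
let $\hat g_i^*$ be defined by $\|\hat g_i\| = 1$,
$\hat g_i^* \in \mathcal{G}_*^i$, $\hat g_i^* \perp \mathcal{F}^{i-1}$.
Further, for $i < k$, let $\psi_i = \hat g_i^* \hat f_i \neq 0$. Then
there is an $m \geq k$ such that

    $$ \tilde g_m^* \hat f_k \neq 0 $$

where $\tilde g_m^* \in \mathcal{G}_*^m$,
$\tilde g_m^* \perp \mathcal{F}^{k-1}$.

PROOF. First, since $F_k = [f_1, \ldots, f_k]$ has rank k and
$\hat f_k$ has a non-zero component in the direction $f_k$ (from (2.1)),
$\hat f_k = F_k v \neq 0$. Now,

    $$ G_n^* = \begin{bmatrix} g_1^* \\ \vdots \\ g_n^* \end{bmatrix} $$

has full rank, so $G_n^* \hat f_k \neq 0$. Thus there is some m such
that $g_m^* \hat f_k \neq 0$. By construction, $g_\ell^* \hat f_k = 0$ for
$\ell < k$, so $m \geq k$. Let
$\tilde g_m^* = g_m^* - \sum_{i=1}^{k-1} (g_m^* \hat f_i/\psi_i)\, \hat g_i^*$.
By noting that $\tilde g_m^* \hat f_k = g_m^* \hat f_k \neq 0$, the
result follows.  ■


Thus, on breakdown, we can switch $g_k^*$ and $g_m^*$ and continue
the process.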
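
The switching of row vectors can be sketched as follows. This Python/NumPy fragment performs two-sided Gram-Schmidt and, when the inner product at step k vanishes, searches forward through the remaining rows for one that can be used instead, in the spirit of Lemma 2.1. It is a sketch only: real arithmetic is assumed, the threshold `tol` and the function name `tsgs_with_row_lookahead` are hypothetical, and the data are the matrices of Example 2.1.

```python
import numpy as np

def tsgs_with_row_lookahead(F, G, tol=1e-12):
    """Two-sided Gram-Schmidt; on a zero inner product, swap in a later row of G.

    F holds the column vectors f_k, G holds the row vectors g_k^* (both n x n).
    Returns Fhat (columns), Ghat (rows), psi, and the order in which rows were used.
    """
    n = F.shape[0]
    rows = [G[i].astype(float) for i in range(n)]
    order = list(range(n))
    Fh, Gh, psi = [], [], []
    for k in range(n):
        f = F[:, k].astype(float)
        f = f - sum((fh * (gh @ f) / ps for fh, gh, ps in zip(Fh, Gh, psi)), np.zeros(n))
        f = f / np.linalg.norm(f)
        for m in range(k, n):
            g = rows[m]
            g = g - sum(((g @ fh) / ps * gh for fh, gh, ps in zip(Fh, Gh, psi)), np.zeros(n))
            if abs(g @ f) > tol * np.linalg.norm(g):
                break                      # usable row found at position m
        else:
            raise ZeroDivisionError(f"no usable row vector at step {k + 1}")
        rows[k], rows[m] = rows[m], rows[k]
        order[k], order[m] = order[m], order[k]
        g = g / np.linalg.norm(g)
        Fh.append(f)
        Gh.append(g)
        psi.append(g @ f)
    return np.column_stack(Fh), np.vstack(Gh), np.array(psi), order

F3 = np.eye(3)
G3 = np.array([[1.0, 0.0, 1.0],
               [0.0, 0.0, 1.0],
               [1.0, 1.0, 0.0]])
Fhat, Ghat, psi, order = tsgs_with_row_lookahead(F3, G3)
print(order)                      # row 3 is used at step 2, row 2 at step 3
print(np.round(Ghat @ Fhat, 10))  # diagonal matrix of the psi_i
```

For these data the second row gives a zero inner product at step 2, the third row does not, and after the exchange the process runs to completion with all $\psi_i$ non-zero.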


2.4  The  look-ahead  scheme  and  subspaces 

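What effect does the switching of primary vectors have on the
subspaces? The exchange of $g_k^*$ with $g_m^*$ can be considered a
reselection of subspaces of $\mathcal{G}_*^n$. Thus, for
$i = k, \ldots, m-1$

    $$ \mathcal{G}_*^i \text{ becomes } \hat{\mathcal{G}}_*^i = \mathrm{span}\{g_1^*, \ldots, g_{k-1}^*, g_m^*, g_{k+1}^*, \ldots, g_i^*\} . $$

For all other i, $\mathcal{G}_*^i$ remains unchanged.

This dynamic interpretation may not preserve all properties of the
original subspaces (such as $B\mathcal{K}^i \subset \mathcal{K}^{i+1}$ in
the case of Krylov subspaces). We, therefore, present the following
interpretation.

When $\tilde g_k^* \tilde f_k = 0$, we find the first $g_m^*$ such that
$g_m^* \tilde f_k \neq 0$, generate
$\tilde g_m^* = g_m^* - \sum_{i=1}^{k-1} (g_m^* \hat f_i/\psi_i)\, \hat g_i^*$,
and normalize $\tilde f_k$ and $\tilde g_m^*$ to form $\hat f_k$ and
$\hat g_k^*$. Now, in place of $\hat g_k^* \in \mathcal{G}_*^k$, we have
$\hat g_k^* \in \mathcal{G}_*^m$. Further, for $i = k+1, \ldots, m$
(from (2.1)),

    $$ \hat g_i^* \in \mathcal{G}_*^m $$

since $\hat g_k^*$ is included in the sum. In other words, instead of
generating one row basis vector from each $\mathcal{G}_*^i$, we are
generating $(m-k+1)$ row basis vectors from $\mathcal{G}_*^m$.

The set of vectors we finally generate is, in fact, a basis for
the subspace $\mathcal{G}_*^m \backslash \mathcal{G}_*^{k-1} \subset \mathcal{G}_*^m$
(i.e., $\mathcal{G}_*^m \backslash \mathcal{G}_*^{k-1}$ is the subspace of
$\mathcal{G}_*^m$ orthogonal to $\mathcal{F}^{k-1}$). Thus, a solution to
serious breakdown in TSGS consists of selecting a basis from a larger
subspace.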


2.5  The  look-ahead  scheme  and  matrices 

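We do not need to restrict ourselves to only changing the row
subspaces. We can, in fact, "look-ahead" in both row and column
subspaces and select bases from
$\mathcal{G}_*^m \backslash \mathcal{G}_*^{k-1}$ and
$\mathcal{F}^m \backslash \mathcal{F}^{k-1}$.

If we let

    $$ H^* = \begin{bmatrix} h_1^* \\ \vdots \\ h_\ell^* \end{bmatrix} \qquad \text{and} \qquad C = [c_1, \ldots, c_\ell] , \qquad \ell = m-k+1 , $$

with

    $$ h_i^* = g_{k-1+i}^* - \sum_{j=1}^{k-1} (g_{k-1+i}^* \hat f_j/\psi_j)\, \hat g_j^* $$

    $$ c_i = f_{k-1+i} - \sum_{j=1}^{k-1} \hat f_j\, (\hat g_j^* f_{k-1+i})/\psi_j $$

then any bases we choose have the matrix representation

    $$ V^*H^* \quad \text{for row vectors} $$

    $$ CU \quad \text{for column vectors} $$

for some invertible $\ell$ x $\ell$ matrices $U$ and $V^*$.

Further, if we let $N = H^*C$, the "connection" matrix of inner
products, and $\hat h_i^* = e_i^*V^*H^*$ and $\hat c_i = CUe_i$, then

    $$ N = V^{-*}\, \Psi\, U^{-1} $$

where $\Psi = \mathrm{diag}\{\hat h_1^* \hat c_1, \ldots, \hat h_\ell^* \hat c_\ell\}$
(assuming biorthogonality of the $\hat h_i^*$'s and $\hat c_i$'s). Thus,
to each selection of basis vectors corresponds a particular
factorization of the inner product matrix N.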


2.6  Two-Sided  Gram-Schmidt  and  LDU  factorization 

We have related the "look-ahead" scheme to some factorization for
a matrix. We wish now to weld this concept onto our LDU factorization
of the moment matrix. Again, we will discuss general spaces and will
return to our actual objective, Krylov subspaces, in the next section.

Consider the modified two-sided Gram-Schmidt process, i.e. at
step k,

    $$ f_\ell := f_\ell - \hat f_{k-1}\, (\hat g_{k-1}^* f_\ell)/\psi_{k-1} , \qquad \ell \geq k $$

    $$ g_\ell^* := g_\ell^* - (g_\ell^* \hat f_{k-1}/\psi_{k-1})\, \hat g_{k-1}^* , \qquad \ell \geq k $$

    $$ \hat f_k = f_k/\|f_k\| , \qquad \hat g_k^* = g_k^*/\|g_k\| , \qquad \psi_k = \hat g_k^* \hat f_k . $$

So, by construction $g_\ell^* \hat f_j = \hat g_j^* f_\ell = 0$ for
$j < k$, $\ell \geq k$ (here $g_\ell^*$ and $f_\ell$ denote vectors
updated at each step).
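In matrix notation this becomes

    $$ \begin{bmatrix} \hat g_1^* \\ \vdots \\ \hat g_k^* \\ g_{k+1}^* \\ \vdots \\ g_n^* \end{bmatrix} = \Gamma_k^{-1} \begin{bmatrix} L_k^{-1} & 0 \\ Z_k & I_{n-k} \end{bmatrix} G_n^* $$

    $$ [\hat f_1, \ldots, \hat f_k, f_{k+1}, \ldots, f_n] = F_n \begin{bmatrix} U_k^{-1} & \hat Z_k \\ 0 & I_{n-k} \end{bmatrix} \Phi_k^{-1} $$

where

    $$ \Gamma_k = \mathrm{diag}\{\|g_1\|, \ldots, \|g_k\|, 1, \ldots, 1\} $$

and

    $$ \Phi_k = \mathrm{diag}\{\|f_1\|, \ldots, \|f_k\|, 1, \ldots, 1\} , $$

so that

    $$ \Gamma_k^{-1} \begin{bmatrix} L_k^{-1} & 0 \\ Z_k & I_{n-k} \end{bmatrix} C^{(1)} \begin{bmatrix} U_k^{-1} & \hat Z_k \\ 0 & I_{n-k} \end{bmatrix} \Phi_k^{-1} = \begin{bmatrix} \Psi_k & 0 \\ 0 & C^{(k+1)} \end{bmatrix} $$

where

    $$ C^{(j)} = \begin{bmatrix} g_j^* f_j & g_j^* f_{j+1} & \cdots & g_j^* f_n \\ g_{j+1}^* f_j & g_{j+1}^* f_{j+1} & \cdots & g_{j+1}^* f_n \\ \vdots & \vdots & & \vdots \\ g_n^* f_j & g_n^* f_{j+1} & \cdots & g_n^* f_n \end{bmatrix} $$

is the matrix of inner products at step j.

Here we have just reiterated the correspondence between TSGS and
the LDU factorization of $C^{(1)}$. Now, to what does the look-ahead
scheme correspond?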
As shown in the previous section, the extraction of biorthogonal basis vectors from $G^{*\,m}\backslash G^{*\,k-1}$ and $F^m\backslash F^{k-1}$ corresponds to the factorization of an $\ell \times \ell$ inner product matrix. This $\ell \times \ell$ matrix is the $\ell$th principal submatrix of $C^{(k)}$.

The matrix interpretation of the look-ahead scheme is as follows:

We perform Gaussian elimination for $k-1$ steps and then encounter a zero pivot. We do not wish to use either partial or complete pivoting for various reasons (e.g. not all elements are readily available). Following Kahan (Parlett and Bunch [1971]) we prefer to generalize the notion of a pivot, from a scalar quantity to a matrix. We then search for a suitably well-conditioned principal submatrix and, using an appropriate factorization, use that as our pivot.

Pictorially, the final factorization is then

[Figure: block structure of the final factorization of $C^{(1)}$.]


2.7  Lanczos  and  look-ahead 

We now have a method to remedy the breakdown of the two-sided Gram-Schmidt process. To interpret this for Lanczos we replace $F^k$ with $K^k$ and $G^{*\,k}$ with $K^{*\,k}$.

As we have seen, the selection of basis vectors corresponds to the factorization of a particular matrix. In the Lanczos process, it is convenient not to use the principal submatrix of the moment matrix (matrix of inner products) but to use a scaled version. This will become clear as we present the look-ahead.

Let 


and 


*  j  “1 

-1 

■p* 

Lk 

p*B 

•Zk  !n-k+l  - 

• 

K  =  [r^ , . . . ,r|( ,rk+j .... ,?n]  -  [q»Bq,...,B  ^q] 


4 


i 


4 

n-k+1 


where $Z_k$ is such that $\tilde r_j$ is orthogonal to $K^{*\,k-1}$ (and, by symmetry, $\tilde s_j^*$ is orthogonal to $K^{k-1}$) for $j \ge k$. Then


n, 


Lzk  Jn-k+lJ 


I  Z  V^I 
L^k  Y  An-k+l 


k-1 
0  M 


0 

n-k+1  -J  L 


Lt  Zk 


I 


n-k+1  -1 


I 


k-1 

0  fi_ 


Lk 


Zk 


(k). 


n-k+1 J  L  6  !n-k+l J 


with $\Omega_{k-1} = \mathrm{diag}(\omega_1,\ldots,\omega_{k-1})$ and $L_k$ and $L_k^*$ no longer with unit diagonals. In Lanczos, it is the principal submatrices of $\tilde M_{n-k+1}$ rather than those of $M_{n-k+1}$ with which we have interest.

We thus consider the 1×1 principal submatrix of $\tilde M_{n-k+1}$, namely $\omega_k$. If $\omega_k = 0$, we try the 2×2 principal submatrix of $\tilde M_{n-k+1}$ and so on until we have a suitable pivot.
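The search for a pivot described above can be sketched as follows. This is only an illustration under stated assumptions: the trailing block is given as a dense list of lists, and "suitable" is approximated by a non-negligible determinant, whereas the text asks for a well-conditioned block. The names `det`, `smallest_usable_pivot` and the threshold `tol` are hypothetical.

```python
def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [list(map(float, row)) for row in M]
    n = len(A)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[p][c] == 0.0:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d  # a row swap flips the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d


def smallest_usable_pivot(M, max_size, tol=1e-10):
    """Return the smallest s such that the leading s-by-s principal
    submatrix of M has |det| > tol, or None if no size up to max_size works."""
    for size in range(1, max_size + 1):
        block = [row[:size] for row in M[:size]]
        if abs(det(block)) > tol:
            return size
    return None
```

For `M = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]` the 1×1 pivot vanishes but the 2×2 leading block is non-singular, so a 2×2 pivot is selected. A result of `None` corresponds to no pivot of the allowed sizes being usable.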
2.8  Effects  of  the  look-ahead  scheme  on  the  J  matrix

We  must  pay  a  price  for  stepping  outside  the  strict  sequence  of 
Krylov  subspaces. 

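For this section we need to adjust our notation. Let $i$ denote the $i$th step of the "look-ahead" algorithm whether the pivot is 1×1, 2×2, or larger. Let $\ell$ denote the actual vector index and let

$$P_i^* = \begin{bmatrix} p_\ell^* \\ \vdots \\ p_{\ell+k-1}^* \end{bmatrix} \quad\text{and}\quad Q_i = [q_\ell,\ldots,q_{\ell+k-1}]$$

whenever a $k \times k$ pivot is used. Then

$$J_m = \begin{bmatrix} P_1^* \\ \vdots \\ P_m^* \end{bmatrix} B\,[Q_1,\ldots,Q_m] = \begin{bmatrix} P_1^*BQ_1 & \cdots & P_1^*BQ_m \\ \vdots & & \vdots \\ P_m^*BQ_1 & \cdots & P_m^*BQ_m \end{bmatrix}$$

The P's and Q's enjoy the same orthogonality properties that the p's and q's did in the two-sided Lanczos algorithm. That is,

$$P_i^*Q_j = 0, \quad i \ne j.$$

Thus, $J_m$ reduces to block tridiagonal form

$$J_m = \begin{bmatrix} A_1 & \Gamma_2 & & \\ B_2 & A_2 & \ddots & \\ & \ddots & \ddots & \Gamma_m \\ & & B_m & A_m \end{bmatrix}$$

($B_j$ here denotes upper case $\beta$); the dimensions of the blocks being determined by the dimensions of the P's and Q's.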


2.9  Subspaces  for  look-ahead  Lanczos 


Recall that factorization of the pivot matrix corresponds to selection of a pair of bases. The factorization used needs to hold some advantage over the infinite number of other choices; an advantage that must be reflected in the selection of the associated basis vectors.

Before continuing, it is necessary to indicate precisely what the subspaces are and how they are produced. These spaces correspond to $G^{*\,m}\backslash G^{*\,k-1}$ and $F^m\backslash F^{k-1}$ of section 2.6.

Let $k-1$ be the number of successful steps of the two-sided Lanczos process, that is, we have successfully found bases for $K^{k-1}$ and $K^{*\,k-1}$. Let us now assume that breakdown has occurred. We need to find $m$ such that $K^m\backslash K^{k-1}$ and $K^{*\,m}\backslash K^{*\,k-1}$ have $(m-k+1)$-dimensional biorthogonal bases. How do we find them?

First we need a set of primary vectors spanning the appropriate subspaces. Recall from Chapter I that $Bq_k \in K^{k+1}$. Similarly, $Br_k \in K^{k+1}$ since $r_k \in K^k$, $r_k \perp K^{*\,k-1}$. Further, $B^2r_k \in K^{k+2}$, $B^3r_k \in K^{k+3}$ and so on. So, with $r_k$ at hand, the representative vectors for $K^{k+1}$ through $K^m$ are obtained by matrix products with $B$. To obtain vectors in $K^m\backslash K^{k-1}$ we need only orthogonalize each $B^ir_k$ to $K^{*\,k-1}$. Such orthogonalizations require that basis vectors for (perhaps all of) $K^{k-1}$ and $K^{*\,k-1}$ be available.

We can avoid keeping these bases around by following the two-sided Lanczos algorithm. Let $\tilde r_k$ be the current residual, and generate the rest of the primary vectors for $K^m\backslash K^{k-1}$ as follows.
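For $j = k,\ldots,m-1$

$$\tilde r_{j+1} = B\tilde r_j - q_{k-1}\,(p_{k-1}^*B\tilde r_j/\omega_{k-1}) \qquad (2.2a)$$

LEMMA 2.2. $\tilde r_\ell \perp K^{*\,k-1}$.

PROOF. $\tilde r_k \perp K^{*\,k-1}$ by construction so $B\tilde r_k \perp K^{*\,k-2}$. Since $q_{k-1} \perp K^{*\,k-2}$, $\tilde r_{k+1} \perp K^{*\,k-2}$. Now $p_{k-1}^*\tilde r_{k+1} = p_{k-1}^*B\tilde r_k - p_{k-1}^*q_{k-1}\,(p_{k-1}^*B\tilde r_k/\omega_{k-1}) = 0$; thus $\tilde r_{k+1} \perp K^{*\,k-1}$. The result follows by induction. ■

Similarly, we may generate $K^{*\,m}\backslash K^{*\,k-1}$ by the following: Let $\tilde s_k^*$ be the current residual.

For $j = k,\ldots,m-1$

$$\tilde s_{j+1}^* = \tilde s_j^*B - (\tilde s_j^*Bq_{k-1}/\omega_{k-1})\,p_{k-1}^* \qquad (2.2b)$$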
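A single step of the recurrences (2.2a) and (2.2b) can be sketched in plain Python. The sketch assumes real arithmetic, a dense matrix stored as a list of rows, and $\omega_{k-1} = p_{k-1}^*q_{k-1}$ (the definition used with (3.1)); the function names are illustrative. It demonstrates only the last step of the proof of Lemma 2.2: the new column vector is orthogonal to $p_{k-1}^*$ and the new row vector is orthogonal to $q_{k-1}$, whatever the current residuals are.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(B, v):
    """Column product B v for B stored as a list of rows."""
    return [dot(row, v) for row in B]


def vecmat(v, B):
    """Row product v B for B stored as a list of rows."""
    return [sum(v[i] * B[i][j] for i in range(len(B))) for j in range(len(B[0]))]


def next_primary_column(B, r, q_prev, p_prev):
    """One step of (2.2a): B r - q_prev * (p_prev . B r / omega_prev)."""
    omega_prev = dot(p_prev, q_prev)
    Br = matvec(B, r)
    coeff = dot(p_prev, Br) / omega_prev
    return [a - coeff * b for a, b in zip(Br, q_prev)]


def next_primary_row(B, s, q_prev, p_prev):
    """One step of (2.2b): s B - (s B . q_prev / omega_prev) * p_prev."""
    omega_prev = dot(p_prev, q_prev)
    sB = vecmat(s, B)
    coeff = dot(sB, q_prev) / omega_prev
    return [a - coeff * b for a, b in zip(sB, p_prev)]
```

Only $q_{k-1}$ and $p_{k-1}^*$ are needed, which is why the earlier basis vectors do not have to be kept in fast storage.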


2.10  Choosing  orthogonal  bases 

We now have row and column vectors spanning the subspaces of consequence. From these vectors we will generate bases to continue our process. But how shall we decide between different bases?

Assume that $Q$ is an $n \times \ell$ ($\ell = m-k+1$) matrix whose column space is $K^m\backslash K^{k-1}$ and $P^*$ is such that $P^*Q$ is diagonal and the row space of $P^*$ is $K^{*\,m}\backslash K^{*\,k-1}$. There is no loss of generality in forcing all basis vectors to have Euclidean norm of unity. It follows that

$$1 \le \|P^*\| \le \sqrt{\ell}, \qquad 1 \le \|Q\| \le \sqrt{\ell}.$$

A prevalent measure of the linear independence of the rows of $P^*$ is

$$\mathrm{cond}(P^*) = \|P^*\|\,\|P^+\| = \sigma_1(P^*)/\sigma_\ell(P^*)$$

where $P^+$ denotes the pseudo-inverse of $P^*$, and $\sigma_1(P^*) \ge \sigma_2(P^*) \ge \cdots \ge \sigma_\ell(P^*) > 0$ are the singular values of $P^*$. Our normalization ensures that $\sqrt{\ell} \ge \sigma_1(P^*)$ and so our interest focuses on $\sigma_\ell(P^*)$. Note further that

$$\sigma_\ell(P^*) \le (\sigma_1(P^*)\cdots\sigma_\ell(P^*))^{1/\ell} = \det(P^*P)^{1/(2\ell)}.$$

There seems to be little prospect of estimating $\sigma_\ell(P^*)$ or $\sigma_\ell(Q)$ directly.

We try another approach. Let $\ell = n$ and suppose that $P^*Q = \Omega = \mathrm{diag}\{\omega_1,\ldots,\omega_n\}$ with $1 \ge \omega_1 \ge \omega_2 \ge \cdots \ge \omega_n > 0$. In this case $P^{-*} = Q\Omega^{-1}$ and

$$\mathrm{cond}(P^*) = \|P^*\|\,\|P^{-*}\| = \|P^*\|\,\|Q\Omega^{-1}\| \le \sqrt{n}\,\sqrt{n}\,\omega_n^{-1} = n/\omega_n.$$

This suggests that among biorthogonal bases we should prefer those which maximize

$$\min_i |p_i^*q_i|. \qquad (2.3)$$

The assumption $\ell = n$ was for simplicity only. In general, for $\ell < n$, $P^+ \ne Q\Omega^{-1}$ and so

$$\mathrm{cond}(P^*) \le \|P^*\|\,\|Q\Omega^{-1}\| \le \|P^*\|\,\|Q\|\,\|\Omega^{-1}\| \le \ell/\omega_\ell.$$

To return to the look-ahead Lanczos, we wish to select bases which maximize (2.3) over $k \le i \le m$. This is a non-trivial problem. We will discuss the 2×2 pivot in Chapter III but leave the general case as beyond the scope of this work.
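The criterion (2.3) is cheap to evaluate once a candidate pair of bases is at hand. The following sketch assumes real vectors supplied as lists, with the $i$th row vector paired with the $i$th column vector, and normalizes each vector to unit Euclidean length before taking inner products. The function names are illustrative, and `condition_bound` simply evaluates the bound $\ell/\omega_\ell$ derived above; it does not compute the condition number itself.

```python
import math


def unit(v):
    """Scale v to unit Euclidean norm."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]


def pairing_criterion(P_rows, Q_cols):
    """min_i |p_i . q_i| over unit-normalized, paired basis vectors (2.3)."""
    return min(
        abs(sum(a * b for a, b in zip(unit(p), unit(q))))
        for p, q in zip(P_rows, Q_cols)
    )


def condition_bound(P_rows, Q_cols):
    """Upper bound l / omega_l on cond(P*) for l paired unit vectors."""
    return len(P_rows) / pairing_criterion(P_rows, Q_cols)
```

For the biorthogonal pair `P_rows = [[1, 1], [0, 1]]`, `Q_cols = [[1, 0], [1, -1]]` both normalized inner products have magnitude $1/\sqrt{2}$, so the bound on the condition number is $2\sqrt{2}$.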


2.11 Practical side of pivot selection


In the previous section we discussed the selection of bases for a particular pair of subspaces, $K^m\backslash K^{k-1}$ and $K^{*\,m}\backslash K^{*\,k-1}$. In particular we must not only select bases within subspaces, but between subspaces themselves. If, say, $\cos\angle(r_k,s_k^*)$ were very small but non-zero, exact arithmetic would allow the Lanczos process to continue, whereas finite precision would cause the Lanczos process to behave erroneously.

To employ the cosines of angles between the different possible bases, we must first decide which angles are important and which are not. We do not wish, for example, to compare only the possible bases in $K^m\backslash K^{k-1}$ and $K^{*\,m}\backslash K^{*\,k-1}$. By ignoring the bases of smaller subspaces, we may miss a smaller subspace coupled with a subspace beyond consideration at step $k$ which would yield a superior pair of bases.

Instead we assume that to each pivot block corresponds one "optimal" pair of bases and we then

$$\text{maximize}\ \min_i |p_i^*q_i|. \qquad (2.4)$$

This criterion gives a way to determine the "best" pivot at each step.

Of practical concern are the limitations of fast memory. We need to reduce memory requirements as much as possible. Further, the block tridiagonal form is not immediately amenable to eigenvalue analysis. Thus some restrictions on the form of the J matrix are in order.

Recall that the blocks of the J matrix are of the form $P_i^*BQ_j$ where $|i-j| \le 1$. If $P_i^*$ is $k \times n$ and $Q_j$ is $n \times m$, $P_i^*BQ_j$ is $k \times m$. Thus, with a maximum pivot size of $\ell$, the bandwidth of J may be as large as $3\ell$.
We can reduce the maximum bandwidth to $2\ell+1$ by preserving the order of one Krylov sequence. For example, if $q_i \in K^i$ for all $i$, then $p_i^*Bq_j = 0$ for $i - j > 1$, and the J matrix retains the Hessenberg form.

2.12  Summary 

We  now  have  a  method  for  stabilizing  the  two-sided  Lanczos  process. 
This  stabilization  preserves  as  much  as  possible  of  the  Krylov  space 
structure.  The  cost  of  our  remedy  may  seem  expensive,  several  extra 
matrix-vector  products,  but  will  be  absorbed  in  use,  as  seen  in 
Chapter  III. 

Further, (2.4) gives a clear measure of the superiority of one pair of bases over another and one pivot size over another. The differences in dimension of the competing subspaces are unimportant.

That is, if, at step $i$, we use a $k \times k$ pivot instead of an $\ell \times \ell$ pivot, $k < \ell$, the problem of selecting basis vectors at step $i+1$ is independent of the selection at step $i$, in spite of the overlap of subspaces.

Finally, we cannot ignore the fact that there is available only a limited amount of storage. The vectors we can retain in memory and the size of the J matrix are limited. The storage crunch is reduced by forcing one Krylov sequence to remain intact. Further, by limiting pivot size we can limit the bandwidth of J and the number of vectors that must be kept in fast storage.

In other words, the look-ahead mechanism allows us to balance the competing demands for well-conditioned bases and limited fast storage.

It turns out that the simple extension to allow 2×2 pivots eliminates many instances of bad bases without a significant increase in storage requirements.


III.  2x2  Pivot 


3.1  Introduction 

In this chapter we complete our discussion of the generalized pivot. The analysis is non-trivial so we confine ourselves to the 2×2 case and leave the general case to subsequent work.

The relationship between the pivot factorization and the bases is exhibited for some familiar factoring schemes. Further, the angles between the subspaces, à la Davis and Kahan, are presented. This approach gives bases independent of the primary vectors and presents us with a tool for finding the best bases in the sense of Chapter II (section 2.11).

To complete the discussion of the look-ahead algorithm, we need a criterion for judging approximate eigenvalues. Alas, none exists. However, Kahan, Parlett and Jiang have produced residual bounds on approximate eigensystems and we generalize their discussion of residual bounds for Lanczos to encompass the look-ahead Lanczos.

Much of the discussion refers to quantities defined in Chapter II. As a brief review, recall that $s_j^*$ and $r_j$ denote the row and column residual vectors at step $j$ with $s_j^* \perp K^{j-1}$ and $r_j \perp K^{*\,j-1}$. Further, the look-ahead procedure determines biorthogonal basis vectors in $K^m\backslash K^{j-1}$ (all vectors in $K^m$ orthogonal to $K^{*\,j-1}$) and $K^{*\,m}\backslash K^{*\,j-1}$.

The pivot matrix will be defined explicitly for the two dimensional case. The explicit relationship between the pivot factorization and the bases determined by that factorization (section 2.6) will be used without being rederived.
3.2 Subspace considerations

As in Chapter II, assume that the two-sided Lanczos has proceeded for $j-1$ steps without serious breakdown. That is, the basis vectors $\{q_1,\ldots,q_{j-1}\}$ and $\{p_1^*,\ldots,p_{j-1}^*\}$ are such that $q_i \in K^i$ and $p_i^* \in K^{*\,i}$. We now assume that the current residuals $r_j$ and $s_j^*$ are too nearly orthogonal to proceed with the Lanczos process.

Following section 2.9, we generate the primary vectors which determine the subspaces of interest, $K^{j+1}\backslash K^{j-1}$ and $K^{*\,j+1}\backslash K^{*\,j-1}$. With $r_j$ and $s_j^*$ already present, the remaining primary vectors are defined from (2.2a) and (2.2b) by

$$\tilde r_{j+1} = Br_j - q_{j-1}\,(\omega_j/\omega_{j-1}), \qquad \tilde s_{j+1}^* = s_j^*B - (\omega_j/\omega_{j-1})\,p_{j-1}^* \qquad (3.1)$$

where $\omega_j = s_j^*r_j$ and $\omega_{j-1} = p_{j-1}^*q_{j-1}$.

Let $\mathcal{R} = K^{j+1}\backslash K^{j-1}$ and $\mathcal{S}^* = K^{*\,j+1}\backslash K^{*\,j-1}$ be the planes for which we wish to generate biorthogonal bases. Then

$$\mathcal{R} = \mathrm{span}\{r_j,\tilde r_{j+1}\} \subset F^n, \qquad \mathcal{S}^* = \mathrm{span}\{s_j^*,\tilde s_{j+1}^*\} \subset F_n.$$

Let $R = [r_j,\tilde r_{j+1}]$ and let $S^*$ have rows $s_j^*$ and $\tilde s_{j+1}^*$; then the pivot matrix $W$ is defined by

$$W = S^*R.$$

(See section 2.7, Chapter II.) For this chapter, $W$ is assumed to be non-singular.

Note that for any selection of biorthogonal bases $\hat R = [r,r_+]$ and $\hat S^* = [s,s_+]^*$ in $\mathcal{R}$ and $\mathcal{S}^*$, respectively, there exist invertible 2×2 matrices $U$ and $V$ such that

$$\hat R = RU, \qquad \hat S^* = V^*S^*. \qquad (3.2)$$

Further, from Chapter II,

$$W = V^{-*}\,\Psi\,U^{-1} \qquad (3.3)$$

where $\Psi = \hat S^*\hat R = \mathrm{diag}\{\psi_1,\psi_2\}$.

Assume that $\hat R$, $\hat S^*$ is the pair of bases which is produced by the look-ahead procedure. Recall that to compare this choice with the biorthogonal pair of vectors produced by the two-sided Lanczos process ((2.4) with $\ell = 2$), we must calculate the cosines between the possible basis vectors.

This appears to be extra work, since it seems to require the generation of $\hat R$ and $\hat S^*$. However, by utilizing (3.2) and (3.3) the cost becomes minimal (see the algorithm in the appendix for details). The labor involves the primary vectors for the subspaces but does not require $\hat R$ and $\hat S^*$ explicitly. Further, if the two-sided Lanczos is used instead of the 2×2 pivot scheme, the matrix-vector products in (3.1) are not wasted (again, details accompany the algorithm in the appendix).

3.3  Pivot  factorization 

When discussing factorizations of matrices, some obvious candidates come to mind. These factorizations are exhibited below in terms of their correspondence to basis vectors in the underlying subspaces $\mathcal{R}$ and $\mathcal{S}^*$, and their effect on the J-matrix. Here, we are only concerned with the directions of the basis vectors, and ignore the effects of scaling.


1. LU Factorization

$$W = \begin{bmatrix} 1 & 0 \\ \theta/\omega & 1 \end{bmatrix} \begin{bmatrix} \omega & \theta \\ 0 & \tilde\omega - \theta^2/\omega \end{bmatrix}$$

Here both the row and column Krylov sequences are preserved. Thus, locally, the J-matrix is both upper and lower Hessenberg.

This factorization corresponds to two successive steps of the two-sided Lanczos process.


2. UL Factorization

$$W = \begin{bmatrix} 1 & \theta/\tilde\omega \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \omega - \theta^2/\tilde\omega & 0 \\ \theta & \tilde\omega \end{bmatrix}$$

Here both $\tilde r_{j+1}$ and $\tilde s_{j+1}^*$ are preserved but $r_j$ and $s_j^*$ are modified. This is equivalent to exchanging $\tilde s_{j+1}^*$ with $s_j^*$ and $\tilde r_{j+1}$ with $r_j$ in the Krylov sequences. We can preserve a mixed symmetry in J ($|J_{ik}| = |J_{ki}|$) with appropriate scaling, but the Hessenberg form is lost.


3. QR Factorization

$$W = \begin{bmatrix} \omega & -\theta \\ \theta & \omega \end{bmatrix} \begin{bmatrix} \theta^2+\omega^2 & \theta(\omega+\tilde\omega) \\ 0 & \omega\tilde\omega - \theta^2 \end{bmatrix} \frac{1}{\theta^2+\omega^2}$$

Here the Hessenberg form of J is again preserved. The row space basis vectors, though, are both in $K^{*\,j+1}$ so that a definite bump above the super-diagonal is created.


4. LU with Interchange

$$W = \begin{bmatrix} \omega/\theta & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \theta & \tilde\omega \\ 0 & \theta - \omega\tilde\omega/\theta \end{bmatrix}$$

The factorization corresponds to partial pivoting in the LU factorization of $W$. In terms of the residual vectors the column Krylov sequence ($q_j \in K^j$, $q_{j+1} \in K^{j+1}$) is preserved (thus the Hessenberg form of J) but in the row Krylov sequence $s_j^*$ and $\tilde s_{j+1}^*$ are exchanged (thus a definite bump above the super-diagonal of J).

5. Spectral Decomposition

$$W = U^*\Psi U, \qquad U^*U = I, \qquad \Psi = \mathrm{diag}\{\psi_1,\psi_2\}$$

$$\psi_1 = \tfrac{1}{2}\left(\omega + \tilde\omega + ((\omega-\tilde\omega)^2 + 4\theta^2)^{1/2}\right)$$

$$\psi_2 = \tfrac{1}{2}\left(\omega + \tilde\omega - ((\omega-\tilde\omega)^2 + 4\theta^2)^{1/2}\right)$$

Here the row vectors both come from $K^{*\,j+1}$ and the column vectors come from $K^{j+1}$. Thus the J-matrix bulges on both sides of the off-diagonal as in the UL factorization. Also, with appropriate scaling mixed symmetry is preserved.
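The LU and UL factorizations of the 2×2 pivot can be checked numerically with a short sketch. It assumes that the pivot matrix is the real symmetric matrix with diagonal entries $\omega$ and $\tilde\omega$ and off-diagonal entry $\theta$, which is the form suggested by the eigenvalue formulas of the spectral decomposition; the function names and the argument name `omega_t` (standing for $\tilde\omega$) are illustrative only.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def lu_2x2(omega, theta, omega_t):
    """Unit lower triangular L and upper triangular U with W = L U."""
    L = [[1.0, 0.0], [theta / omega, 1.0]]
    U = [[omega, theta], [0.0, omega_t - theta * theta / omega]]
    return L, U


def ul_2x2(omega, theta, omega_t):
    """Unit upper triangular U and lower triangular L with W = U L."""
    U = [[1.0, theta / omega_t], [0.0, 1.0]]
    L = [[omega - theta * theta / omega_t, 0.0], [theta, omega_t]]
    return U, L
```

With $\omega = 2$, $\theta = 1$, $\tilde\omega = 3$ both products reproduce the same matrix. The multiplier $\theta/\omega$ in the LU form grows without bound as $\omega \to 0$, which is the near-breakdown situation, whereas the UL form depends on $\tilde\omega$ instead.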

3.4 The angles between $q_i$ and $p_i^*$

We digress slightly to lay the foundations for a useful analytic tool. Recall (from section 3.2) that $\mathcal{R}$ and $\mathcal{S}$ (not $\mathcal{S}^*$) are subspaces of $F^n$.

In [Davis & Kahan, 1970] it is shown to be proper to speak of the (two) angles between $\mathcal{R}$ and $\mathcal{S}$. In addition to the well known minimum angle between a vector in $\mathcal{R}$ and a vector in $\mathcal{S}$, there is another well defined angle which has to be included in a full assessment of the relationship between $\mathcal{R}$ and $\mathcal{S}$. These angles depend only on $\mathcal{R}$ and $\mathcal{S}$ but, nevertheless, there is a distinguished pair of bases associated with them. This "angle basis" will be useful in our analysis.

We denote this basis by the columns of $\hat Q = [\hat q,\hat q_+]$ for $\mathcal{R}$ and the columns of $\hat P = [\hat p,\hat p_+]$ for $\mathcal{S}$. (For this section only we "transpose" $\mathcal{S}^*$ and $\hat P^*$ in order to consider $\mathcal{R}$ and $\mathcal{S}$ subspaces of the same $F^n$.)

The matrices $\hat Q$ and $\hat P$ are distinguished by four properties:

(i) $\hat P^*\hat Q = \Sigma$ (diagonal)

(ii) $\hat P^*\hat P = I_2$

(iii) $\hat Q^*\hat Q = I_2$

(iv) $\angle(\hat q,\hat p) = \min \angle(r,s)$ over $r \in \mathcal{R}$ and $s \in \mathcal{S}$

When $\angle(\hat q,\hat p) < \angle(\hat q_+,\hat p_+)$ then $\hat P$ and $\hat Q$ are unique to within ±. For reasons given below this pair of bases is not preferred in the Look-Ahead Lanczos algorithm.

We note in passing that properties (i) and (iv) together determine $\hat p_+$ and $\hat q_+$ (provided that $\hat p$ and $\hat q$ are unique).

PROOF. The vector $\hat q_+$ is in the one dimensional subspace of $\mathcal{R}$ orthogonal to $\hat p$. Similarly $\hat p_+$ is in the one dimensional subspace of $\mathcal{S}$ orthogonal to $\hat q$. ■

3.5  The  angle  basis  and  the  SVD  of  P*Q 

Kahan and Davis show how to find $\hat Q$ and $\hat P$ from any pair of orthogonal bases of $\mathcal{R}$ and $\mathcal{S}$. Let $P^*$ and $Q$ be orthonormal bases for $\mathcal{S}^*$ and $\mathcal{R}$. Then

$$S^* = \tilde V P^*, \qquad R = Q\tilde U \qquad (3.4)$$

where $\tilde V$ and $\tilde U$ are invertible 2×2 matrices.

The bases $\hat P^*$ and $\hat Q$ are then found as follows:

$$\hat P^* = V^*P^*, \qquad \hat Q = QU \qquad (3.5)$$

where $P^*Q = V\Sigma U^*$ is the singular value decomposition (SVD) of $P^*Q$ and $\Sigma = \mathrm{diag}\{\sigma_1,\sigma_2\}$ is the matrix of the cosines of the angles between $\mathcal{R}$ and $\mathcal{S}^*$.

Of practical interest is the fact that $\sigma_1$ and $\sigma_2$ can be obtained from $R$ and $S^*$ without forming any intermediate vectors. This follows from rearranging (3.4) and substituting in (3.5) to get

$$V\Sigma U^* = \tilde V^{-1}\,W\,\tilde U^{-1}.$$

So the angle basis comes from an unobvious factorization of $W$.
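For a 2×2 matrix such as $P^*Q$ the two singular values, and hence the two cosines, have a closed form, so no general SVD routine is needed. The sketch below uses the identities $\sigma_1^2 + \sigma_2^2 = \|A\|_F^2$ and $\sigma_1\sigma_2 = |\det A|$ for a real 2×2 matrix $A$ with rows $(a, b)$ and $(c, d)$; the function name `singular_values_2x2` is illustrative.

```python
import math


def singular_values_2x2(a, b, c, d):
    """Singular values (largest first) of the real matrix [[a, b], [c, d]]."""
    frob2 = a * a + b * b + c * c + d * d      # sigma1^2 + sigma2^2
    det = a * d - b * c                        # +/- sigma1 * sigma2
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    s1 = math.sqrt((frob2 + disc) / 2.0)
    s2 = math.sqrt(max((frob2 - disc) / 2.0, 0.0))
    return s1, s2
```

When the rows of $P^*$ and the columns of $Q$ are orthonormal, both values lie in $[0, 1]$ and can be read as the cosines $\sigma_1 \ge \sigma_2$ of the angles between the two planes.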

3.6  Maximizing  bases 

We now wish to find a pair of bases which is the best in the sense of section 2.11, that is, the bases $\{p^*,p_+^*\}$ and $\{q,q_+\}$ such that

$$\max_{P,Q}\ \min\{|p^*q|,\,|p_+^*q_+|\}$$

is attained. The maximum can be determined as in the following theorem.


THEOREM 3.1. Let $\Psi(P^*,Q) = \min\{|p^*q|,\,|p_+^*q_+|\}$ where $P^* = \{p^*,p_+^*\}$ and $Q = \{q,q_+\}$ are any pair of biorthogonal bases for $\mathcal{S}^*$ and $\mathcal{R}$, respectively, with $\|p^*\| = \|p_+^*\| = 1$ and $\|q\| = \|q_+\| = 1$. Then

$$\max_{P,Q} \Psi(P,Q) = \frac{2\sigma_1\sigma_2}{\sigma_1+\sigma_2} \quad (= \text{harmonic mean of } \sigma_1 \text{ and } \sigma_2)$$

where $\sigma_1 \ge \sigma_2 \ge 0$ are the cosines of the angles between $\mathcal{S}^*$ and $\mathcal{R}$. Further,

$$\sigma_2 \le \max_{P,Q;\ p^*p_+ = 0} \Psi(P,Q) \le \sigma_1.$$

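PROOF. Let $\hat P^*$ and $\hat Q$ denote the angle bases of section 3.4. Let $P^*$, with rows $p^*$ and $p_+^*$, and $Q = [q,q_+]$ be any other biorthogonal bases with $\|p^*\| = \|p_+^*\| = 1$ and $\|q\| = \|q_+\| = 1$. Thus, if we let $V$ be such that

$$P^* = V^*\hat P^*$$

then

$$V^* = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\psi & \cos\psi \end{bmatrix}$$

for some pair of angles $\theta$ and $\psi$.

Further, to preserve biorthogonality, if

$$Q = \hat QU$$

then $U$ has the form

$$U = \begin{bmatrix} \tau_1^{-1}\sigma_2\cos\psi & -\tau_2^{-1}\sigma_2\sin\theta \\ -\tau_1^{-1}\sigma_1\sin\psi & \tau_2^{-1}\sigma_1\cos\theta \end{bmatrix}$$

where $\tau_1^2 = \sigma_1^2\sin^2\psi + \sigma_2^2\cos^2\psi$ and $\tau_2^2 = \sigma_1^2\cos^2\theta + \sigma_2^2\sin^2\theta$. Then $P^*Q = \mathrm{diag}\{\delta_1,\delta_2\}$ with

$$\delta_1 = \tau_1^{-1}\sigma_1\sigma_2\cos(\theta+\psi), \qquad \delta_2 = \tau_2^{-1}\sigma_1\sigma_2\cos(\theta+\psi).$$

Define $\zeta_1(\theta,\psi)$ and $\zeta_2(\theta,\psi)$ by

$$\zeta_1(\theta,\psi) = \sigma_1\sigma_2\cos(\theta+\psi)/\tau_1(\psi)$$

$$\zeta_2(\theta,\psi) = \sigma_1\sigma_2\cos(\theta+\psi)/\tau_2(\theta)$$

where

$$(\tau_1(\psi))^2 = \sigma_1^2\sin^2\psi + \sigma_2^2\cos^2\psi$$

$$(\tau_2(\theta))^2 = \sigma_1^2\cos^2\theta + \sigma_2^2\sin^2\theta.$$

Then

$$\Psi(P,Q) = \min\{\zeta_1(\theta,\psi),\,\zeta_2(\theta,\psi)\}$$

for appropriate $\theta$, $\psi$. Our problem thus becomes finding $\max_{\theta,\psi}\min\{\zeta_1(\theta,\psi),\zeta_2(\theta,\psi)\}$. We do this with the help of two lemmas. The first isolates stationary points on level curves.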


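LEMMA 3.1. For fixed $\psi$, $\zeta_1(\theta,\psi)$ has a relative maximum at $\theta = -\psi$ and $\zeta_2(\theta,\psi)$ has a relative maximum at $\theta = \arctan\left(-\frac{\sigma_1^2}{\sigma_2^2}\tan\psi\right)$. Further, $V^*\hat P^*$ is orthonormal at $\theta = -\psi$ and $\hat QU$ is orthonormal at $\theta = \arctan\left(-\frac{\sigma_1^2}{\sigma_2^2}\tan\psi\right)$.

PROOF.

$$\frac{\partial\zeta_1}{\partial\theta}(\theta,\psi) = -\frac{\sigma_1\sigma_2\sin(\theta+\psi)}{\tau_1(\psi)} = 0 \quad\text{when } \theta = -\psi$$

$$\frac{\partial^2\zeta_1}{\partial\theta^2}(\theta,\psi)\Big|_{\theta=-\psi} = -\sigma_1\sigma_2/\tau_1(\psi) < 0$$

$$\frac{\partial\zeta_2}{\partial\theta}(\theta,\psi) = -\sigma_1\sigma_2\,(\sigma_2^2\sin\theta\cos\psi + \sigma_1^2\cos\theta\sin\psi)/(\tau_2(\theta))^3 = 0 \quad\text{when } \tan\theta = -\frac{\sigma_1^2}{\sigma_2^2}\tan\psi$$

$$\frac{\partial^2\zeta_2}{\partial\theta^2}(\theta,\psi) = -\sigma_1\sigma_2\left[(\sigma_2^2\cos\theta\cos\psi - \sigma_1^2\sin\theta\sin\psi)(\tau_2(\theta))^{-3} + (\sigma_2^2\sin\theta\cos\psi + \sigma_1^2\cos\theta\sin\psi)\frac{\partial}{\partial\theta}\left[(\tau_2(\theta))^{-3}\right]\right]$$

$$= -\sigma_1\sigma_2\,\tau_1(\psi)\left(\sigma_1\sigma_2\cos^2\theta/\tau_2(\theta) + \sigma_1\sigma_2\sin^2\theta/\tau_2(\theta)\right)/(\tau_2(\theta))^3 \quad\text{when } \tan\theta = -\frac{\sigma_1^2}{\sigma_2^2}\tan\psi$$

$$= -(\sigma_1\sigma_2)^2\,\tau_1(\psi)/(\tau_2(\theta))^4 < 0.$$

When $\theta = -\psi$,

$$V^* = \begin{bmatrix} \cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{bmatrix}.$$

Now let $\frac{\sigma_2}{\sigma_1}\tan\theta = -\frac{\sigma_1}{\sigma_2}\tan\psi = \tan\alpha$ for some $\alpha$. Then

$$\sin\alpha = \sigma_2(\sin\theta)/\tau_2(\theta) = -\sigma_1(\sin\psi)/\tau_1(\psi)$$

$$\cos\alpha = \sigma_1(\cos\theta)/\tau_2(\theta) = \sigma_2(\cos\psi)/\tau_1(\psi)$$

so that

$$U = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}.$$ ■

The second lemma establishes the point where $\Psi$ changes from $\zeta_1(\theta,\psi)$ to $\zeta_2(\theta,\psi)$.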

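LEMMA 3.2.

$$\zeta_1(\theta,\psi) = \zeta_2(\theta,\psi) = \frac{\sigma_1\sigma_2\sin(2\psi)}{\sqrt{\sigma_1^2\sin^2\psi + \sigma_2^2\cos^2\psi}} \qquad (3.6)$$

when $\theta = \psi - \frac{\pi}{2}$.

PROOF. $\zeta_1(\theta,\psi) = \zeta_2(\theta,\psi)$ when $\tau_1(\psi) = \tau_2(\theta)$, i.e.

$$\sigma_1^2\sin^2\psi + \sigma_2^2\cos^2\psi = \sigma_1^2\cos^2\theta + \sigma_2^2\sin^2\theta$$

or

$$\tfrac{1}{2}\left((\sigma_1^2+\sigma_2^2) - (\sigma_1^2-\sigma_2^2)\cos 2\psi\right) = \tfrac{1}{2}\left((\sigma_1^2+\sigma_2^2) + (\sigma_1^2-\sigma_2^2)\cos 2\theta\right)$$

which reduces to

$$\cos 2\psi = -\cos 2\theta$$

which occurs when

$$2\psi = 2\theta + \pi$$

or

$$\psi = \theta + \frac{\pi}{2}. \qquad (3.7)$$

Substituting (3.7) into $\zeta_1(\theta,\psi)$ and $\zeta_2(\theta,\psi)$ gives the result. ■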

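We now have two lemmas which seem unrelated to the problem. However, for a fixed $\psi$, $\max(\Psi)$ must occur either at a relative maximum or when $\zeta_1 = \zeta_2$. That is, for fixed $\psi$, either

$$\Psi = \zeta_1 = \zeta_2 \quad\text{for } \theta = \psi - \frac{\pi}{2}$$

or

$$\Psi = \begin{cases} \zeta_2 & \text{when } \tau_1(\psi) < \tau_2(\theta) \\ \zeta_1 & \text{when } \tau_1(\psi) > \tau_2(\theta). \end{cases}$$

The symmetry of the properties of the two planes $\mathcal{R}$ and $\mathcal{S}^*$ means that for orthogonal bases, only $\zeta_2(\theta,\psi)$ with $\theta = -\psi$ need be maximized along with maximizing (3.6).

When $\theta = -\psi$,

$$\sigma_2 \le \zeta_2(-\psi,\psi) = \frac{\sigma_1\sigma_2}{\sqrt{\sigma_1^2\cos^2\psi + \sigma_2^2\sin^2\psi}} \le \sigma_1. \qquad (3.8)$$

However, when $\zeta_2(-\psi,\psi) > \zeta_1(-\psi,\psi)$, $\Psi = \zeta_1(-\psi,\psi)$, so that the cross-over point from $\zeta_2$ to $\zeta_1$ is of interest. Thus, we are again concerned with maximizing (3.6). (Note that (3.8) constitutes the second part of the theorem.)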
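The maximization of (3.6) can be checked numerically. The sketch below assumes the reading of (3.6) as $\sigma_1\sigma_2\sin 2\psi / \sqrt{\sigma_1^2\sin^2\psi + \sigma_2^2\cos^2\psi}$ and compares a brute-force maximum over a grid of $\psi$ in $(0, \pi/2)$ with the harmonic mean of Theorem 3.1. The function names and the grid size are illustrative choices.

```python
import math


def crossover_value(psi, s1, s2):
    """Common value of zeta_1 and zeta_2 on the cross-over curve (3.6)."""
    tau = math.sqrt(s1 * s1 * math.sin(psi) ** 2 + s2 * s2 * math.cos(psi) ** 2)
    return s1 * s2 * math.sin(2.0 * psi) / tau


def grid_maximum(s1, s2, steps=20000):
    """Largest value of (3.6) over an interior grid of (0, pi/2)."""
    h = (math.pi / 2.0) / steps
    return max(crossover_value(i * h, s1, s2) for i in range(1, steps))


def harmonic_mean(s1, s2):
    return 2.0 * s1 * s2 / (s1 + s2)
```

Setting the derivative of the square of (3.6) to zero gives $\sin^2\psi = \sigma_2/(\sigma_1+\sigma_2)$ at the maximum; for $\sigma_1 = 0.9$, $\sigma_2 = 0.3$ this is $\psi = \pi/6$, where (3.6) equals the harmonic mean $0.45$.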

From  (3.6)  we  have 
with the upper bound achieved only when $\sigma_2 \to 0$. Further, from Theorem 3.1, for $P^*$ orthonormal

$$\sigma_2 \le |p^*q|,\ |p_+^*q_+| \le \sigma_1.$$

Thus, by forcing $P^*$ to be orthonormal, $\Psi$ will never be less than half its maximum possible value. Further,

CLAIM. For fixed $q \in \mathcal{R}$, $S^*q \ne 0$, there are $p^*$ and $p_+^* \in \mathcal{S}^*$ and $q_+ \in \mathcal{R}$ such that

$P^*Q$ is diagonal

and $P^*P = I$.

PROOF. Let $q \in \mathcal{R}$. Let $\mathrm{span}\{p_1^*,p_2^*\} = \mathcal{S}^*$ with $\|p_1^*\| = \|p_2^*\| = 1$, $p_1^*q \ne 0$. Let

$$\tilde p_+^* = p_2^* - (p_2^*q/p_1^*q)\,p_1^*.$$

Then

$$p_+^* = \tilde p_+^*/\|\tilde p_+^*\|$$

$$p^* = (p_1^* - (p_1^*p_+)\,p_+^*)/(1 - (p_1^*p_+)^2)^{1/2}.$$

Let $\tilde q \in \mathcal{R}$, $\tilde q \ne q$, then

$$q_+ = \tilde q - (p^*\tilde q/p^*q)\,q.$$ ■

If $q$ is set as $r_j/\|r_j\|$ and $P^*$ is orthogonal, the J-matrix remains Hessenberg while the cosine of the maximum angle is no worse than half the optimum.




3.8  Residual  bounds 

For  symmetric  matrices,  the  Rayleigh-Ritz  procedure  gives  the 
best  approximation  to  eigenvalues  when  approximate  eigenvectors  are 
known,  and  this  theory  has  been  exploited  in  the  symmetric  Lanczos  pro¬ 
cess  (Parlett  [1980]).  Kahan,  Parlett  and  Jiang  ([1981])  approached 
the  non-symmetric  case  and  produced  residual  bounds  to  measure  conver¬ 
gence  in  the  sense  of  backwards  error  analysis. 

We summarize the results below and extend them to handle the look-ahead procedure. The terminology established for the symmetric case, though not precisely correct in the non-symmetric case, is used. Thus "Ritz value" denotes an eigenvalue of the J-matrix, and "Ritz vector" corresponds to a particular approximation to an eigenvector of $B$.

It is important to note that for scalar $\theta$ and vectors $x$ and $y^*$ we are not producing a bound on $|\lambda(B)-\theta|$ as can be done in the symmetric case, but a lower bound on $\|B-\tilde B\|$ where the $n \times n$ matrix $\tilde B$ has $(\theta,x,y^*)$ as an eigentriple [$(\alpha,z,w^*)$ is called an eigentriple of $C$ if $Cz = z\alpha$ and $w^*C = \alpha w^*$]. Thus we assess the convergence of $(\theta,x,y^*)$ to eigentriples of $B$ in terms of the deformation needed to make them exact.

The  main  results  of  the  Kahan,  Parlett  and  Jiang  paper  (KPJ)  will 
be  presented  without  proof,  beginning  with  the  main  theorem  of  their 
work. 

THEOREM 3.2 (Kahan, Parlett and Jiang). Let $n \times n$ $B$ and $n \times m$ orthonormal $P$ and $Q$ be given. For any $m \times m$ $D$ let

$$C = (P^*Q)^{-1}D(P^*Q)$$

$$R = BQ - QC$$

$$S^* = P^*B - DP^*$$

$$Z = P^*(BQ - QC) = (P^*B - DP^*)Q.$$

Then there exist solutions $E$ of

$$(B-E)Q = QC \quad\text{and}\quad P^*(B-E) = DP^*$$

with minimal norms; some with

$$\|E\| = \min_E \|E\| = \max\{\|R\|,\,\|S^*\|\}$$

and others with

$$\|E\|_F = \min_E \|E\|_F = (\|R\|_F^2 + \|S^*\|_F^2 - \|Z\|_F^2)^{1/2}.$$

Let $(\theta,z,w^*)$ be an eigentriple of $J_j$, the J-matrix from the $j$th step of the look-ahead procedure. Let $Q_j = [q_1,\ldots,q_j]$ and let $P_j^*$ have rows $p_1^*,\ldots,p_j^*$; then the "Ritz vectors" $x$ and $y$ are defined by

$$x = Q_jz$$

$$y^* = w^*P_j^*.$$

Assume that $w^*z = 1$ and $P_j^*Q_j = I_j$ so that $y^*x = 1$. Then

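COROLLARY 3.1 (KPJ). The closest matrix to $B$ with $(\theta,x,y^*)$ as an eigentriple is $B-E$ for $E$ satisfying

$$\|E\| = \max\left\{\frac{\|q_{j+1}\beta_{j+1}\zeta_j\|}{\|x\|},\ \frac{\|\omega_j\gamma_{j+1}p_{j+1}^*\|}{\|y^*\|}\right\}$$

where

$$BQ_j - Q_jJ_j = [0,\ldots,0,\,q_{j+1}\beta_{j+1}]$$

$$P_j^*B - J_jP_j^* = [0,\ldots,0,\,\gamma_{j+1}p_{j+1}]^*,$$

$\zeta_j$ is the last element of $z$, $\omega_j$ the last element of $w$, and $\beta_{j+1}\gamma_{j+1} = s_{j+1}^*r_{j+1}$.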

COROLLARY 3.2 (KPJ). Let $(\theta,z,w^*)$ be an eigentriple of $J_j$ with $w^*z = 1$. Then for all $k > j$, $(\theta,z,w^*)$ is an eigentriple of $J_k - G_k$


and  with 


e.  the  i  column  of  I j 

ak  -  '-ur^j+i2* +  l“N*ri)Sej+i 


Moreover , 


IGkl  *  max 


,Sk'F  lw.,2 


and  both  norms  are  independent  of  k. 


The proof of the above corollary does not require strict use of the Lanczos process, but requires that the $j$th step be done using a 1×1 pivot. Further, we can modify the proofs to handle the 2×2 pivot.


3.9  Residual  bounds  with  2x2  pivots 


The key to generalizing Corollaries 3.1 and 3.2 is to remember the block tridiagonal structure of the J-matrix (section 2.8) so that

$$J_j = \begin{bmatrix} A_1 & \Gamma_2 & & \\ B_2 & A_2 & \ddots & \\ & \ddots & \ddots & \Gamma_j \\ & & B_j & A_j \end{bmatrix}$$

(recall  that  the  i  in  A.  refers  to  the  step  count  and  is  indepen- 

J 

dent  of  the  pivot  size  used).  Let  3  C?j+1 ,6j+l^* 

ri+.j  3  [<$j+i  »Vj+i ]*  be  the  additions  to  the  J-matrix  for  a  1  *1  pivot. 
Then  by  putting  Bi+1  in  place  of  Bj+1  and  r^+1  in  place  of  y^-j 
in  Corollary  3.1,  we  get 


COROLLARY  3.3.  The  closest  matrix  to  B  with  (8,x,y*)  as  an  eigen- 
triple  is  B-E  satisfying 


where  BQj  -  Q^Jj  =  ^j+1  *^*^j+l ’Bj+i  ^  (3.9a) 

PjB-JjP}  -  (Pj+1(0 . 5j+1  *Vl)}*  (3*9b) 

W  ■  (oiq  » .  • .  ,U3j  ) »  Z  * 

and  x  3  QjZ  ,  y*  »  w*P?  . 

PROOF. The equations (3.9a) and (3.9b) drop out of the Look-Ahead Lanczos algorithm. We may post multiply (3.9a) by any vector in $F^j$. The most useful choice is the eigenvector $z$ associated with the Ritz value $\theta$, so that


Bx  -  x6  3  BQjZ  -  QjZ9 

*  BQjZ  -  QjJjZ  3  qj  (0, .  •  •  ,0,Cj_^i  )z 


Similarly 


y*B  -  0y*  = 


=  w*P?B  -  0w*Pt 
J  J 

=  w*P?B  -  W*J^Pt 

J  J  J 

r°  i 


By applying Theorem 3.2 with $m = 1$, the result follows. ■

With the trick of replacing $\beta_{j+1}$ by $B_{j+1}$ and $\gamma_{j+1}$ by $\Gamma_{j+1}$, the KPJ proof of Corollary 3.1 goes over to Corollary 3.3 in a straightforward way. Similarly, Corollary 3.2 generalizes to


COROLLARY  3.4.  Let  (0,z,w*)  be  an  eigentriple  of  J.  with  w*z 

J 

Then  for  all  k  >  j,  (0,2,W*)  is  an  eigentriple  of  J^-G^  and 


(with 


Cth  e^  the  i  column  of  IjJ 


G„  -  ♦  (  J"  \j;'  tl|B 


JH  J  J+1  J=J)ge*  . 
Iw*l  ;wej+l  * 


Moreover t 


■Vf 


6 j+i cj+pj+i cj-i '  ' y j+lV6 j+i^i-i 


l&j-HVplHhlci-ll2  +  Ki+l  I2 


and  the  norm  are  independent  of  k. 


Though at present we have restricted ourselves to the 2×2 case, the generalization holds for any pivot size. The replacement of $\beta_{j+1}$ and $\gamma_{j+1}$ by $B_{j+1}$ and $\Gamma_{j+1}$, respectively, is independent of the length of $B_{j+1}$ and $\Gamma_{j+1}$. Note that the residuals remain of rank one.

So now we can, in principle, test all the Ritz values, $\theta_i$, at each step and determine which are acceptable in the sense of being eigenvalues of matrices close to $B$.
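The acceptance test for a single approximate eigentriple can be sketched directly from Theorem 3.2 with $m = 1$: after scaling $x$ and $y$ to unit length, the smallest perturbation $E$ that makes $(\theta,x,y^*)$ exact has norm $\max\{\|Bx - x\theta\|/\|x\|,\ \|y^*B - \theta y^*\|/\|y^*\|\}$. The sketch assumes real arithmetic and a dense $B$ stored as a list of rows, and it forms the residuals explicitly rather than through the last elements of $z$ and $w$ as the corollaries do. The function name `residual_norm_bound` is illustrative.

```python
import math


def residual_norm_bound(B, theta, x, y):
    """max(||B x - x theta|| / ||x||, ||y B - theta y|| / ||y||), real case."""
    n = len(B)
    Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
    yB = [sum(y[i] * B[i][j] for i in range(n)) for j in range(n)]
    col_res = [Bx[i] - theta * x[i] for i in range(n)]   # column residual
    row_res = [yB[j] - theta * y[j] for j in range(n)]   # row residual

    def norm(v):
        return math.sqrt(sum(a * a for a in v))

    return max(norm(col_res) / norm(x), norm(row_res) / norm(y))
```

For `B = [[2, 1], [0, 3]]` the triple $\theta = 2$, $x = (1, 0)$, $y^* = (1, -1)$ is exact and the bound is zero. Shifting $\theta$ to $2.1$ with the same vectors gives a bound of $0.1$, so a Ritz value would be accepted when this quantity falls below a tolerance chosen relative to $\|B\|$.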

3.10  Summary 

We now have an understanding of the effects of various factorizations of the 2×2 pivot, just as we saw the effects of various factorizations of the 1×1 pivot in Chapter I. Further, we have seen that the natural bases, the angle bases, are not the most desirable either for keeping J sparse or for keeping $\angle(q_i,p_i^*)$ minimal.

Moreover, we determined just how much can be gained from any factorization and how to weigh this against a more convenient structure for the J-matrix. The increase of $\max\min\{|p^*q|,\,|p_+^*q_+|\}$ by a factor of 2 may be the difference between continuing the Look-Ahead Lanczos process and admitting failure. However, without a convenient method for solving the eigenvalue problem for non-Hessenberg J, what was gained with the optimum basis is lost in converting J to Hessenberg form.

Finally,  the  assessment  of  convergence  of  eigenelements  of  J  to 
eigenelements  of  B  has  also  been  discussed.  Though  seemingly  out  of 
place  in  this  chapter,  this  presentation  completes  the  material  neces¬ 
sary  for  producing  a  working  (though  not  necessarily  efficient) 
procedure.  As  noted,  the  residual  bound  calculations  may  be  performed 
for  every  Ritz  value  at  each  step,  and  convergence  of  appropriate  eigen¬ 
elements  can  be  determined. 




IV.  Serious  Breakdown 

4.1  Introduction 

We  have  presented  Lanczos  without  serious  breakdown,  and  the 
Look-Ahead  Lanczos  for  some  cases  of  serious  breakdown.  However,  there 
are  cases  where  the  Look-Ahead  Lanczos  process  cannot  succeed,  no  matter 
how  large  the  pivot  we  use.  This  form  of  breakdown  we  call  "incurable". 

Incurable breakdown at first glance seems a disaster. We are in
possession of non-zero residuals which are mutually orthogonal, for
which there is no forward-looking remedy. We will show, however,
that incurable breakdown is a blessing peculiarly related to the
encountering of a zero residual.

To  complete  the  discussion  of  the  Look-Ahead  Lanczos  algorithm, 
we  present  a  characterization  motivated  by  the  foregoing  analysis  of 
breakdown  for  which  the  look-ahead  algorithm  is  successful.  This 
characterization  rounds  out  the  analysis  of  the  look-ahead. 


4.2  Invariant  subspaces 


Suppose by some special relation of the starting vectors, that
the Krylov subspaces become invariant before the nth step. Say, let
$\mathcal{K}^k(q,B)$ and $\mathcal{K}_*^{\ell}(p^*,B)$ be invariant subspaces with $k, \ell < n$. Note
that in the discussion of incurable breakdown, we may disregard the
case of $k = n$ or of $\ell = n$, since Chapter II shows that such
breakdown is impossible.

Define the row and column generalized eigenvectors of B, $w_i^*$ and
$z_j$, respectively, so that $w_i^* z_j = \delta_{ij}$. Then it follows from the
Jordan form of B that for some $i_1,\dots,i_k$ and $j_1,\dots,j_\ell$ that


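We may assume that

$$q = \sum_{m=1}^{k} a_m z_{i_m}, \qquad a_m \ne 0, \quad m = 1,\dots,k \tag{4.1a}$$

and

$$p^* = \sum_{m=1}^{\ell} b_m w_{j_m}^*, \qquad b_m \ne 0, \quad m = 1,\dots,\ell \tag{4.1b}$$

since such vectors exist in $\mathcal{K}^k$ and $\mathcal{K}_*^{\ell}$, respectively, and may be
used to generate the Krylov subspaces.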

Let $s_m^*$ and $r_m$ be the row and column residual vectors,
respectively, at step m. Then incurable breakdown at step m is
defined by

$$s_m^* \ne 0^*, \qquad r_m \ne 0$$

and

$$s_m^* B^j r_m = 0, \quad j \ge 0 .$$

Thus incurable breakdown occurs when $s_m^* \perp \mathcal{K}^k$ (or equivalently
$r_m \perp \mathcal{K}_*^{\ell}$). Note that incurable breakdown must occur at step
$m < \min\{\ell,k\}$, since at step $j = \min\{\ell,k\}$ one residual is zero, which is
not breakdown.

4.3  The  moment  matrix  and  incurable  breakdown 

Eventually we will link incurable breakdown to the eigenexpansions
(4.1a) and (4.1b). We will accomplish this in steps, the first
relating incurable breakdown to the rank of the moment matrix.




LEMMA 4.1. Let $M_n$ be the n×n moment matrix generated by p*, q
and B. Let $K_n^*$ be the row Krylov matrix and $K_n$ the column Krylov
matrix. Then incurable breakdown occurs if and only if

$$\mathrm{rank}(M_n) < \min\{\mathrm{rank}(K_n^*),\mathrm{rank}(K_n)\} = \min\{\dim(\mathcal{K}_*^n),\dim(\mathcal{K}^n)\} .$$


PROOF. Sufficiency: Assume that the Look-Ahead Lanczos algorithm
suffers incurable breakdown at step m. Since the look-ahead algorithm
is a modified two-sided Gram-Schmidt process, it is equivalent to
making elementary matrix operations on $K_n^*$ and $K_n$. Let U and V*
be the matrices which perform these operations so that

$$e_j^* V^* K_n^* = \begin{cases} p_j^*, & j < m \\ s_m^* B^{j-m}, & j \ge m \end{cases} \qquad K_n U e_j = \begin{cases} q_j, & j < m \\ B^{j-m} r_m, & j \ge m \end{cases}$$

where $e_j$ is the jth column of $I_n$.

Recall that incurable breakdown means $s_m^* B^i r_m = 0$ for $i \ge 0$.
Hence, with $P_{m-1}^*$ the matrix of rows $p_j^*$ and $Q_{m-1}$ the matrix of columns $q_j$, $j < m$,

$$V^* K_n^* K_n U = V^* M_n U = \begin{bmatrix} P_{m-1}^* Q_{m-1} & 0 \\ 0 & 0 \end{bmatrix} .$$

Since U and V* are invertible

$$\mathrm{rank}(M_n) = \mathrm{rank}(V^* M_n U) = m-1 .$$

To complete this part of the proof, note that
$r_m \ne 0$ implies $\dim(\mathcal{K}^n) = \mathrm{rank}(K_n) = \mathrm{rank}(K_n U) \ge m$
and $s_m^* \ne 0$ implies $\dim(\mathcal{K}_*^n) = \mathrm{rank}(K_n^*) = \mathrm{rank}(V^* K_n^*) \ge m$.

Necessity: Let $m-1 = \mathrm{rank}(M_n) = \mathrm{rank}(K_n) < \mathrm{rank}(K_n^*)$ (say).
Let U and V* be as above. We need only show $r_m = 0$ to complete
the proof. Let

$$Q_{m-1} = [q_1,\dots,q_{m-1}], \qquad P_{m-1}^* = \begin{bmatrix} p_1^* \\ \vdots \\ p_{m-1}^* \end{bmatrix} .$$

There is no loss in generality in assuming $P_{m-1}^* Q_{m-1} = I_{m-1}$. $\mathrm{Rank}(K_n) = m-1$
implies $r_m = Q_{m-1} c$ for some vector c. Now

$$0 = P_{m-1}^* r_m = P_{m-1}^* Q_{m-1} c = c . \quad ■$$

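EXAMPLE 4.1 (Incurable breakdown). Let

$$B = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad p^* = [1,1,0,0], \qquad q = [1,0,1,0]^T .$$

Then

$$K_4 = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}, \qquad K_4^* = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$

so that

$$M_4 = K_4^* K_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} .$$

Note that $p_1^* q_1 = p^* q = 1$. From (1.3)

$$r_2 = Bq - q\alpha = [-1,1,-1,1]^T \quad \text{and} \quad s_2^* = p^*B - \alpha p^* = [0,-1,0,1] .$$

Further, $s_2^* B^j r_2 = 0$, $j \ge 0$, so we have incurable breakdown. Also
note

$$\mathrm{rank}(M_4) = 1, \qquad \mathrm{rank}(K_4) = 2, \qquad \mathrm{rank}(K_4^*) = 3 .$$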

4.4  The  mismatch  theorem 

The  previous  lemma  characterizes  incurable  breakdown  in  terms  of 
the  rank  of  the  moment  matrix.  Here  we  give  a  more  illuminating 
explanation. 


THEOREM 4.1 (Mismatch Theorem). Let p, q and B be given. Let
$\mathcal{K}^k = \mathrm{span}\{q,Bq,\dots,B^{k-1}q\}$ and $\mathcal{K}_*^{\ell} = \mathrm{span}\{p^*,p^*B,\dots,p^*B^{\ell-1}\}$ be
invariant subspaces of dimension k and $\ell$, respectively. Then
incurable breakdown occurs at step i if and only if there are
generalized row eigenvectors $\{w_1^*,\dots,w_i^*,w_{i+1}^*,\dots,w_\ell^*\}$ and generalized
column eigenvectors $\{z_1,\dots,z_i,z_{\ell+1},\dots,z_{k+\ell-i}\}$ with


such  that 


L> 


(4.2) 


span{z 


i 

t 


Condition (4.2) departs from the ordering which produces the Jordan
canonical form for Jordan blocks, i.e.

$$Bz_j = \lambda z_j + z_{j+1}$$

$$w_j^* B = \lambda w_j^* + w_{j-1}^* .$$

There is no loss in generality in assuming (4.2), and doing so reduces the
complications in subscripts.


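PROOF OF THEOREM. Sufficiency: Consider $K_n$ and $K_n^*$. Since $\mathcal{K}^k$
is invariant

$$\mathrm{rank}(K_n) = \dim(\mathcal{K}^n) = \dim(\mathcal{K}^k) = k .$$

Similarly

$$\mathrm{rank}(K_n^*) = \dim(\mathcal{K}_*^n) = \dim(\mathcal{K}_*^{\ell}) = \ell .$$

Thus using the invariance of $\mathcal{K}^k$ and $\mathcal{K}_*^{\ell}$, there are invertible X
and Y such that

$$K_n X = [z_1,\dots,z_i,z_{\ell+1},\dots,z_{k+\ell-i},0,\dots,0] = [Z,\tilde Z,0], \qquad Z = [z_1,\dots,z_i], \quad \tilde Z = [z_{\ell+1},\dots,z_{k+\ell-i}]$$

and

$$Y K_n^* = \begin{bmatrix} w_1^* \\ \vdots \\ w_i^* \\ w_{i+1}^* \\ \vdots \\ w_\ell^* \\ 0 \end{bmatrix} = \begin{bmatrix} W^* \\ \tilde W^* \\ 0 \end{bmatrix}, \qquad W^* = \begin{bmatrix} w_1^* \\ \vdots \\ w_i^* \end{bmatrix}, \quad \tilde W^* = \begin{bmatrix} w_{i+1}^* \\ \vdots \\ w_\ell^* \end{bmatrix} .$$

Thus,

$$Y K_n^* K_n X = \begin{bmatrix} W^* Z & W^* \tilde Z & 0 \\ \tilde W^* Z & \tilde W^* \tilde Z & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} I_i & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

by (4.2).

So, $\mathrm{rank}(M_n) = \mathrm{rank}(K_n^* K_n) = \mathrm{rank}(Y K_n^* K_n X) = i < k, \ell$. Thus, by Lemma
4.1, we have incurable breakdown.

Necessity: Assume that there is no mismatch, $i = k = \min(k,\ell)$
(say). Then

$$\mathcal{K}^i = \mathrm{span}\{z_1,\dots,z_i\}$$

$$\mathrm{rank}(K_n) = \dim(\mathcal{K}^n) = \dim(\mathcal{K}^i) = i = \mathrm{rank}(M_n) .$$

Thus by Lemma 4.1, there is no incurable breakdown. ■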


EXAMPLE 4.2. Let B, p* and q be as in Example 4.1. The matrix B
is normal so $w_j^* = z_j^*$ with eigenvalues $i^j$, $j = 1,\dots,4$, $i = \sqrt{-1}$ and
eigenvectors


In this case

$$p^* = z_1^* + z_3^* + z_4^*, \qquad q = z_2 + z_4,$$

$$\mathcal{K}_*^3 = \mathrm{span}\{z_1^*, z_3^*, z_4^*\}, \qquad \mathcal{K}^2 = \mathrm{span}\{z_2, z_4\} .$$


4.5 Ritz values and incurable breakdown

For discussion of the Ritz values of the matrix $J_i$, let us
assume, temporarily, that B has simple roots, so that $w_i^* B = \lambda_i w_i^*$
and $Bz_j = z_j\lambda_j$ for all i and j. The case of defective matrices,
though not unlike the non-defective case, is somewhat more complicated
and its discussion is postponed until the next section.

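What the Mismatch Theorem has given us is that in the case of
incurable breakdown

$$p^* = \hat p^* + \sum_{j=i+1}^{\ell} b_j w_j^*, \qquad q = \hat q + \sum_{j=\ell+1}^{k+\ell-i} a_j z_j$$

where

$$\hat p^* = \sum_{j=1}^{i} b_j w_j^*, \qquad \hat q = \sum_{j=1}^{i} a_j z_j \tag{4.3}$$

with $a_j \ne 0$, $j = 1,\dots,i,\ell+1,\dots,k+\ell-i$; $b_m \ne 0$, $m = 1,\dots,\ell$.

Then any element of the moment matrix $M_n(p,q,B)$ has the form

$$\begin{aligned}
p^* B^m q &= \Big(\hat p^* + \sum_{j=i+1}^{\ell} b_j w_j^*\Big) B^m \Big(\hat q + \sum_{j=\ell+1}^{k+\ell-i} a_j z_j\Big) \\
&= \hat p^* B^m \hat q + \hat p^* \Big(\sum_{j=\ell+1}^{k+\ell-i} a_j B^m z_j\Big) + \Big(\sum_{j=i+1}^{\ell} b_j w_j^* B^m\Big) \hat q + \Big(\sum_{j=i+1}^{\ell} b_j w_j^*\Big)\Big(\sum_{j=\ell+1}^{k+\ell-i} a_j B^m z_j\Big) \\
&= \hat p^* B^m \hat q + \hat p^* \Big(\sum_{j=\ell+1}^{k+\ell-i} \lambda_j^m a_j z_j\Big) + \Big(\sum_{j=i+1}^{\ell} \lambda_j^m b_j w_j^*\Big) \hat q + \Big(\sum_{j=i+1}^{\ell} b_j w_j^*\Big)\Big(\sum_{j=\ell+1}^{k+\ell-i} \lambda_j^m a_j z_j\Big) \\
&= \hat p^* B^m \hat q ,
\end{aligned}$$

where the last step holds because each remaining term pairs a row vector $w_j^*$ with a column vector $z_m$ of a different index, and $w_j^* z_m = 0$ for $j \ne m$.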

Thus we get the following lemma:

LEMMA 4.2. Let $p^*$, $\hat p^*$, $q$ and $\hat q$ be defined by (4.3), then

$$M_n(p^*,q,B) = M_n(\hat p^*,\hat q,B) .$$
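Lemma 4.2 is easy to observe numerically. The sketch below assumes the simplest setting, a diagonal B, so that the row and column eigenvectors are the unit vectors and the coefficients $b_j$, $a_j$ are simply the entries of p* and q. The function name `moment_matrix` and the particular numbers are illustrative choices, not taken from the text.

```python
def moment_matrix(p, q, lam):
    """Moment matrix with (r, c) entry p* B^(r+c) q for B = diag(lam)."""
    n = len(lam)
    return [[sum(pj * l ** (r + c) * qj for pj, l, qj in zip(p, lam, q))
             for c in range(n)] for r in range(n)]

lam = [2, 3, 5, 7, 11]     # distinct eigenvalues of B = diag(lam)
i, ell, k = 2, 3, 3        # matched part i; subspace dimensions ell and k

p = [1, 2, 3, 0, 0]        # b_1, ..., b_ell non-zero
q = [4, 5, 0, 6, 0]        # a_1, ..., a_i and a_{ell+1}, ..., a_{k+ell-i} non-zero
p_hat = [1, 2, 0, 0, 0]    # matched part of p*
q_hat = [4, 5, 0, 0, 0]    # matched part of q

M = moment_matrix(p, q, lam)
M_hat = moment_matrix(p_hat, q_hat, lam)
print(M == M_hat)
```

The mismatched components (index 3 of p*, index 4 of q) never meet a partner of the same index, so they leave every moment unchanged.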

Incurable breakdown is not a misfortune, as
the following surprising result shows.


THEOREM 4.2. Let B have distinct eigenvalues and let $J_i$ be the
block tridiagonal matrix produced by the Look-Ahead Lanczos process
at step i, with p* and q starting vectors. If incurable breakdown
occurs at step i+1 then each Ritz value of $J_i$ is an eigenvalue of B.

PROOF. By the Mismatch Theorem p* and q have the form (4.3).
Consider now the Look-Ahead Lanczos with the $\hat p^*$ and $\hat q$ defined by
(4.3). The subspaces $\mathcal{K}_*^i(\hat p^*,B)$ and $\mathcal{K}^i(\hat q,B)$ are invariant, so that
each Ritz value of the J-matrix generated using $\hat p^*$ and $\hat q$ is an
eigenvalue of B. By using Lemma 4.2 and Lemma 1.4 (Chapter I) the
result follows. ■


EXAMPLE 4.4. Let p*, q and B be as in Examples 4.1 and 4.2. Then

$$J_1 = \lambda(J_1) = \alpha_1 = p^*Bq = 1 = \lambda_4(B) .$$
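Theorem 4.2 can also be watched at work on a slightly larger case. The following is a sketch under stated assumptions: B is taken diagonal, the starting vectors have matched part of size i = 2, and the plain two-sided Lanczos recurrence is run without normalization (monic residual polynomials), which is one of several equivalent scalings and not necessarily the one used in Chapter I. All names are illustrative.

```python
from fractions import Fraction

lam = [Fraction(v) for v in (2, 3, 5, 7, 11)]   # eigenvalues of B = diag(lam)
p = [Fraction(v) for v in (1, 2, 3, 0, 0)]      # starting row vector p*
q = [Fraction(v) for v in (4, 5, 0, 6, 0)]      # starting column vector q

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply_B(v):
    # B is diagonal, so row and column products coincide entrywise.
    return [l * x for l, x in zip(lam, v)]

# Step 1
s1, r1 = p, q
w1 = dot(s1, r1)
a1 = dot(s1, apply_B(r1)) / w1
r2 = [bx - a1 * x for bx, x in zip(apply_B(r1), r1)]
s2 = [bx - a1 * x for bx, x in zip(apply_B(s1), s1)]

# Step 2 (three-term recurrence)
w2 = dot(s2, r2)
a2 = dot(s2, apply_B(r2)) / w2
r3 = [bx - a2 * x - (w2 / w1) * y for bx, x, y in zip(apply_B(r2), r2, r1)]
s3 = [bx - a2 * x - (w2 / w1) * y for bx, x, y in zip(apply_B(s2), s2, s1)]

# Moments s_3^* B^m r_3: all zero although r_3 and s_3 are not.
moments, v = [], r3
for _ in range(5):
    moments.append(dot(s3, v))
    v = apply_B(v)

def char_poly_J2(x):
    """Characteristic polynomial of J_2 = [[a1, w2/w1], [1, a2]]."""
    return (x - a1) * (x - a2) - w2 / w1

print(r3, s3, moments, char_poly_J2(2), char_poly_J2(3))
```

Incurable breakdown appears at step 3 = i+1, and the two Ritz values of $J_2$ are exactly the eigenvalues 2 and 3 of B that belong to the matched part of the expansions.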


4.6  Defective  matrices 


By their biorthogonality, the generalized eigenvectors in the
expansions (4.3) influence only those components associated with
their own Jordan block. Thus, to generalize Theorem 4.2, we need only
consider the interaction within a single Jordan block.

Therefore, let B be a Jordan block of grade n, that is, the n×n matrix B
has the form


X  1 
X  1 

Q  _  X*. 

D  ~  •  •  • 

M 

x  _ 

Then 

LEMMA 4.3. Let $p^* = \sum_{i=1}^{k+\ell} b_i e_i^*$ and $q = \sum_{i=k+1}^{n} a_i e_i$ where $e_i$ is the
ith column of $I_n$ and $a_i \ne 0$ and $b_i \ne 0$. Then the J-matrix
generated by p*, q and B is similar to a Jordan block of degree $\ell$.

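For simplicity we make the following notational convention. Let

$$C = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix}$$

and define

$$C_{(i,j)} = \begin{bmatrix} c_{ii} & c_{i,i+1} & \cdots & c_{ij} \\ \vdots & \vdots & & \vdots \\ c_{ji} & c_{j,i+1} & \cdots & c_{jj} \end{bmatrix} .$$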

The proof of Lemma 4.3 is simplified by the following technical lemma.


LEMMA 4.4. $(B)_{(i,j)} = N_{j-i+1}$ where $N_k$ is the k×k Jordan block.

Further,

$$(B^m)_{(i,j)} = (N_{j-i+1})^m .$$


kth 


PROOF.  The  (k,£)u‘  element  of  Bm  = 


k>  l 


Similarly  the  (k,£)th  element  of  .+j  =  < 


I l>i 


(Sjjr1"  £  =  £+m,  m  >  0 
m  — 


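PROOF OF LEMMA 4.3. Let $\hat p^* = \sum_{i=k+1}^{k+\ell} b_i e_i^*$ and $\hat q = \sum_{i=k+1}^{k+\ell} a_i e_i$ so that

$$p^* = \sum_{i=1}^{k} b_i e_i^* + \hat p^* \quad \text{and} \quad q = \hat q + \sum_{i=k+\ell+1}^{n} a_i e_i .$$

Consider any element of the moment matrix

$$p^* B^j q = \Big(\hat p^* + \sum_{i=1}^{k} b_i e_i^*\Big) B^j \Big(\hat q + \sum_{i=k+\ell+1}^{n} a_i e_i\Big) .$$

Using an argument similar to that of the proof of Lemma 4.2 we have

$$p^* B^m q = \hat p^* B^m \hat q = \tilde p^* N_\ell^m \tilde q, \qquad \tilde p^* = \sum_{i=1}^{\ell} b_{k+i} e_i^*, \quad \tilde q = \sum_{i=1}^{\ell} a_{k+i} e_i$$

where $e_i$ is the ith column of $I_\ell$. Thus the moment matrix generated
by p*, q and B is the same as that generated by $\tilde p^*$, $\tilde q$ and $N_\ell$,
with $\mathcal{K}_*^{\ell}(\tilde p^*,N_\ell) = (F^{\ell})^*$ and $\mathcal{K}^{\ell}(\tilde q,N_\ell) = F^{\ell}$. Thus, the J-matrix generated
by p*, q and B is the same as that generated by $\tilde p^*$, $\tilde q$ and $N_\ell$,
the latter J-matrix being similar to $N_\ell$. ■

Thus we have the following extension of Theorem 4.2.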

THEOREM 4.3. Let B be non-derogatory (i.e. each eigenvalue of B
is associated with only one Jordan block) and let $J_i$ be the J-matrix
generated at step i of the Look-Ahead Lanczos. If incurable breakdown
occurs at step i+1, then each Ritz value of $J_i$ is an eigenvalue of B.

4.7  Curable  breakdown 

We now have a characterization of incurable breakdown in terms of
the row and column eigenvector expansions of the starting vectors.

Such a characterization is also possible with curable breakdown
(breakdown which the Look-Ahead algorithm with a suitable pivot
can circumvent). We start by defining curable breakdown of degree $\ell$.

Let diagonalizable B and vectors p* and q be given. Let
$s_{k+1}^*$ and $r_{k+1}$ be the residual vectors after the kth step of the
Look-Ahead Lanczos. Then we say the look-ahead process suffers
curable breakdown of degree $\ell$ at step k if

$$s_{k+1}^* B^m r_{k+1} = 0, \quad m = 0,\dots,\ell-1 \tag{4.4a}$$

$$s_{k+1}^* B^{\ell} r_{k+1} \ne 0 . \tag{4.4b}$$

With this breakdown, the $(\ell+1)\times(\ell+1)$ pivot matrix, X, is a Hankel
matrix of the form

$$X = \begin{bmatrix} 0 & \cdots & 0 & \xi \\ \vdots & & \xi & * \\ 0 & \xi & & \vdots \\ \xi & * & \cdots & * \end{bmatrix}$$

where $\xi \ne 0$ and * denotes a possibly zero element.

The point here is that $X^{-1}$ exists so that the look-ahead process can
continue.
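The existence of the inverse follows from the determinant. The worked equation below is a sketch resting on one assumption that is consistent with (4.4a) and (4.4b) but not spelled out above: that the entries of the pivot matrix are the moments $X_{ab} = s_{k+1}^* B^{a+b} r_{k+1}$, $a,b = 0,\dots,\ell$, so that the anti-diagonal entry is the non-zero moment of order $\ell$.

```latex
% Entries above the anti-diagonal vanish by (4.4a); the anti-diagonal
% entry \xi is non-zero by (4.4b). Only the order-reversing permutation
% \pi of \{0,\dots,\ell\} contributes to the determinant:
\det X \;=\; \operatorname{sgn}(\pi)\,\prod_{a=0}^{\ell} X_{a,\ell-a}
       \;=\; (-1)^{\ell(\ell+1)/2}\,\xi^{\,\ell+1} \;\neq\; 0 .
% Worked case \ell = 1 (the 2x2 pivot):
\det\begin{bmatrix} 0 & \xi \\ \xi & * \end{bmatrix} \;=\; -\xi^{2} \;\neq\; 0 .
```

The entries marked * never enter the determinant, which is why they may be zero without harming the pivot.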

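Let

$$p^* = \sum_{i=1}^{n} b_i w_i^* \quad \text{and} \quad q = \sum_{i=1}^{n} a_i z_i \tag{4.5}$$

where each $(\lambda_i, z_i, w_i^*)$ is an eigentriple of B. Then using (1.5),
(4.4a) becomes

$$\begin{aligned}
0 = s_{k+1}^* B^m r_{k+1} &= (\gamma^{(k)})^{-1} p^* \chi_k(B) B^m \chi_k(B) q\, (\beta^{(k)})^{-1} \\
&= (\omega^{(k)})^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n} b_i w_i^* B^m (\chi_k(B))^2 a_j z_j \\
&= \Big(\sum_{i=1}^{n} \lambda_i^m (\chi_k(\lambda_i))^2 \sum_{j=1}^{n} a_j b_i w_i^* z_j\Big) \Big/ \omega^{(k)} \\
&= \Big(\sum_{i=1}^{n} \lambda_i^m (\chi_k(\lambda_i))^2 a_i b_i\Big) \Big/ \omega^{(k)}
\end{aligned}$$

where $\beta^{(k)}$, $\gamma^{(k)}$ and $\omega^{(k)}$ are as in Chapter I. If we let

$$x = (x_1,\dots,x_n)^T \quad \text{with} \quad x_i = a_i b_i \tag{4.6}$$

then

$$\omega^{(k)} s_{k+1}^* B^m r_{k+1} = (\lambda_1^m,\dots,\lambda_n^m) \Lambda_k x, \quad m = 0,\dots,\ell-1 \tag{4.7}$$

where $\Lambda_k = \mathrm{diag}\{(\chi_k(\lambda_1))^2,\dots,(\chi_k(\lambda_n))^2\}$. So (4.4a) in matrix form
becomes

$$V_\ell \Lambda_k x = 0$$

with the $\ell \times n$ Vandermonde matrix

$$V_\ell = \begin{bmatrix} 1 & \cdots & 1 \\ \lambda_1 & \cdots & \lambda_n \\ \vdots & & \vdots \\ \lambda_1^{\ell-1} & \cdots & \lambda_n^{\ell-1} \end{bmatrix} .$$

Further (4.4b) is

$$\omega^{(k)} s_{k+1}^* B^{\ell} r_{k+1} = (\lambda_1^{\ell},\dots,\lambda_n^{\ell}) \Lambda_k x \ne 0 .$$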

So

THEOREM 4.4. Let p* and q be as in (4.5) and non-defective B be
given. Then curable breakdown of degree $\ell$ occurs at step k if and
only if

$$\Lambda_k x \in N(V_\ell) \quad \text{(the nullspace of } V_\ell\text{)}$$

but

$$\Lambda_k x \notin N(V_{\ell+1})$$

where $x = (x_1,\dots,x_n)^T$ as defined in (4.6), $\Lambda_k = \mathrm{diag}\{(\chi_k(\lambda_1))^2,\dots,(\chi_k(\lambda_n))^2\}$,
and $V_m$ is the m×n Vandermonde matrix for $m = \ell, \ell+1$.

The above characterization is not entirely satisfying. We cannot
escape the dependence of $\Lambda_k$ on x and $V_\ell$. But we can see that the
likelihood of selecting p* and q generating such an x decreases
with the increase in the degree of curable breakdown.

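EXAMPLE 4.3. Consider

$$B = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad p^* = [1,0,0,0], \qquad q = [1,0,0,0]^T .$$

Then

$$r_2 = [0,1,0,0]^T \quad \text{and} \quad s_2^* = [0,0,0,1],$$

$$s_2^* B r_2 = [0,0,1,0]\, r_2 = 0,$$

and

$$s_2^* B^2 r_2 = [0,1,0,0]\, r_2 = 1 .$$

Here, with the eigenvalues of B ordered as $(1,-1,i,-i)$, $i = \sqrt{-1}$,

$$x = [.25,.25,.25,.25]^T, \qquad V_3 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & i & -i \\ 1 & 1 & -1 & -1 \end{bmatrix}, \qquad \Lambda_1 x = [.25,.25,-.25,-.25]^T$$

and

$$V_2 \Lambda_1 x = 0, \qquad V_3 \Lambda_1 x = [0,0,1]^T \ne 0 .$$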
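Example 4.3 can be verified from both sides: directly from the residual moments, and through the Vandermonde form of Theorem 4.4. The sketch below assumes the eigenvalue ordering (1, -1, i, -i) and unnormalized residuals $r_2 = Bq - q\alpha$, $s_2^* = p^*B - \alpha p^*$ with $p^*q = 1$; the helper names are illustrative.

```python
B = [[0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
p = [1, 0, 0, 0]   # row vector p*
q = [1, 0, 0, 0]   # column vector q

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vec_mat(v, A):
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

assert dot(p, q) == 1                       # normalization p*q = 1
alpha = dot(p, mat_vec(B, q))               # alpha = p*Bq / p*q
r2 = [bx - alpha * x for bx, x in zip(mat_vec(B, q), q)]
s2 = [bx - alpha * x for bx, x in zip(vec_mat(p, B), p)]

# Direct moments s_2^* B^m r_2 for m = 0, 1, 2.
moments, v = [], r2
for _ in range(3):
    moments.append(dot(s2, v))
    v = mat_vec(B, v)

# Vandermonde form: rows (lambda_1^m, ..., lambda_n^m) applied to Lambda_1 x.
lam = [1, -1, 1j, -1j]                      # eigenvalues of B (4th roots of unity)
x = [0.25, 0.25, 0.25, 0.25]                # x_i = a_i b_i
Lx = [(l - alpha) ** 2 * xi for l, xi in zip(lam, x)]   # chi_1(lambda) = lambda - alpha
V3Lx = [sum(l ** m * y for l, y in zip(lam, Lx)) for m in range(3)]

print(r2, s2, moments, V3Lx)
```

The first two moments vanish and the third does not, so this is curable breakdown of degree 2 at step 1, handled by a 3x3 pivot.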


4.8  Curable  breakdown  and  defective  matrices 

To handle the case of defective matrices (a matrix is defective
if it has at least one Jordan block of grade > 1), we may again
confine ourselves to a single Jordan block. So let the n×n matrix B be of the
form


B 


(4.9) 


i 


Let  p  and  q  be  as  In  (4.5).  Let  sir  ,  * 
n 

Vi  *  so  that 


b.-  = 


min(k+i-ltn)  .j-i 

l  bi(^TTXk(t 

j=i  1  dtJ  1  K 


1 


i-J 


ai 


j*max(l,k-i+l)  1  dtJ  1  K 


Consider  from  (4.4a) 


Sk+1B  rk+l 


n 


(  l  l  a.2i) 

1*1  11  1*1  1 


n  ^  n  i 

(  l  biwp(  l  z.  I 

1*1  1  1  1*1  1j*max(l ,m-i+l ) 


Rearranging  we  get 


where 


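Thus (4.4a) becomes

$$\hat V_\ell \hat x = 0$$

where $\hat V_\ell$ is the generalized Vandermonde matrix

$$\hat V_\ell = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \lambda & 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ \lambda^{\ell-1} & \binom{\ell-1}{1}\lambda^{\ell-2} & \cdots & 1 & 0 & \cdots & 0 \end{bmatrix}$$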
so that

THEOREM 4.5. Let p* and q be defined by (4.5) and B by (4.9).
Let $\hat b_i$ and $\hat a_i$ be defined by (4.10) and $\hat x$ by (4.11). Then curable
breakdown of degree $\ell$ occurs at step k if and only if

$$\hat x \in N(\hat V_\ell)$$

$$\hat x \notin N(\hat V_{\ell+1}) .$$

PROOF. The proof is completed by showing $\hat x \notin N(\hat V_{\ell+1})$. From (4.4b)

$$0 \ne s_{k+1}^* B^{\ell} r_{k+1} = e_{\ell+1}^* \hat V_{\ell+1} \hat x$$

where $e_{\ell+1}$ is the $(\ell+1)$st column of $I_{\ell+1}$. ■

4.9 Summary

We now have the characterizations for serious breakdown in terms
of the eigensystems. Further, we have that one form of serious breakdown
can be remedied and that the other form is fortuitous in the
search for eigenvalues.

Though the result of Theorem 4.2 is counter-intuitive at first,
it becomes more tangible when we consider the case of only one residual
vector becoming zero. The other residual vector does not interfere
with one important feature of the J-matrix (that is, each Ritz value
of J being an eigenvalue of B). We only lose one set of eigenvectors.

In the case of incurable breakdown we preserve the relationship
between Ritz values of J and some of the eigenvalues of B, but
cannot extract either row or column eigenvectors directly from the
subspaces generated.

Finally, we can characterize curable breakdown, from which we see
that the curable breakdown of degree k becomes less likely as k
increases. Thus the restrictions on pivot size due to practical
considerations such as storage do not unduly restrict the effectiveness
of the algorithm.


The algorithm below leaves the 2x2 pivot factorization (i.e. U
and V*) arbitrary. Recall (section 2.8) that the step i corresponds
to the number of pivots (1x1 or 2x2) used and $\ell$ corresponds to the
number of basis vectors generated.


Action 0:


(Collect and evaluate data from the previous step, i-1)


On hand are $r_\ell$, $s_\ell^*$, $\|r_\ell\|$, $\|s_\ell^*\|$, $P_{i-1}^*$, $Q_{i-1}$, $z_i$, $z_i^*$, $\omega_\ell$.
If $\|r_\ell\|$ or $\|s_\ell^*\|$ is less than some tolerance, then exit with
invariant subspace.


Check  residual  bounds  for  converged  eigentriples. 
Action 1: (Perform look-ahead to determine pivot size)

a: (Complete $R_i = [r_\ell, r_{\ell+1}]$ and $S_i^* = [s_\ell^*; s_{\ell+1}^*]$ by generating
$r_{\ell+1}$ and $s_{\ell+1}^*$ from $r_\ell$ and $s_\ell^*$ (see section 2.9))

Vi  -  Bri  -  Vizi 

b: (Compute needed inner product matrices)


(Six inner products are needed since the (1,1) element
for each matrix comes from step i-1).


c:  1.  (Compute  cosines  of  Important  angles) 

4>1  (■  pjq^)  *  c^/ (Ir^lls*!) 

$2  (■  m1n  max  {Ip^jlMPI+i\+iI})  x  0 

(set for $W_i$ singular)
If $W_i$ cannot be factored skip to action 2.

Factor  W-  into  VT*U-1  and  into  6^. 

(This  version  uses  p*q  *  1) 

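2. (Form norms of prospective basis vectors)

$\|q_\ell\| \leftarrow (e_1^* U_i^* X_i U_i e_1)^{1/2}$

$\|q_{\ell+1}\| \leftarrow (e_2^* U_i^* X_i U_i e_2)^{1/2}$

$\|p_\ell^*\| \leftarrow (e_1^* V_i^* Y_i V_i e_1)^{1/2}$

$\|p_{\ell+1}^*\| \leftarrow (e_2^* V_i^* Y_i V_i e_2)^{1/2}$

If any of the above norms is less than some tolerance,
then skip to action 2.

3. (Form angles between prospective basis vectors)

$\psi_1 \leftarrow \cos\angle(p_\ell^*,q_\ell) = 1/(\|p_\ell^*\| \cdot \|q_\ell\|)$

$\psi_2 \leftarrow \cos\angle(p_{\ell+1}^*,q_{\ell+1}) = 1/(\|p_{\ell+1}^*\| \cdot \|q_{\ell+1}\|)$

4. (Get the minimum angle for comparison)

$\phi_2 \leftarrow \min\{|\psi_1|, |\psi_2|\}$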

Action 2: (Test for failure)

If $|\phi_1|$ and $\phi_2$ are too small, exit with error.

(The look-ahead process with the 2x2 pivot is not guaranteed
to work in all cases (see Chapter IV) and the only reasonable
response is to flag these cases and exit.)

Action 3: (Select bases)

If $|\phi_1|$ > (some bias) × $\phi_2$ then take a single step (1x1
pivot), otherwise, take a double step (2x2 pivot).


p*  *-  ri/6i 

pi  *- 


Single  step  Double  step 

(Form  and  ) 

Bi  ^  »»**  Bi  -  <vTsT^k2 

(Note  that  both  B.  and  r.  are  rank  1  matrices) 

(Form  the  new  residuals) 

r£+l  rJl+l8Jl1  ri+2  (BQi‘-Qi_1ri  >x 


ri,+lBJl 


S£+l  "  Yi+1SA+1  *1+2  y*(PiB“BiPi«i ) 


(x  and  y*  2  element  vectors) 


Ai  *~al  =  elWie2/a)k 


(Form  ) 


“k  Ai  PiBQi 

*  UiCWie2»s*Brk+l^Vi 

(Orthogonal ize) 


r£+l  rA+l  "  qiai  rl+2  rl+2  ~  ^iAix 

s A+l  ^  Si+1  -  Vi  Sl+Z  *i+l  ’  y*Ai  pi 

(Form  inner  products  for  next  step) 


<(-Vi)x1('?l»1/2eI1 

lrW  (rU2rA+2 

,S£+2*  ^SH+2S2,+2 

det(Wi )u>"2 

“t+2  Sl+2r  1+2 

9.  (Set  zi+1  and  z*+1) 

zf+i  -  (1)  zi+1  =  (V*e2)/(y*V*e2) 

2f+1  -  (1)  z*+1  »  (eju^/te^x) 

end  of  step  i 

NOTES 

The above algorithm assumes that $p_\ell^* q_\ell = 1$ rather than $\|p_\ell^*\| = 1$,
$\|q_\ell\| = 1$. Further, no assumption is made about the actual factorization
of $\omega_\ell$ or $W_i$.

The vectors x and y* (action 3c) allow flexibility in specifying
the components of the new residuals strictly within $\mathcal{K}^{\ell+2}$ and
$\mathcal{K}_*^{\ell+2}$, respectively.

The bias factor in action 3 is a programming device which permits
the Look-Ahead Lanczos to implement standard Lanczos (bias = 0) or a
sequence of double steps (2x2 pivots, bias = ∞).
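The decision logic of actions 2 and 3, including the role of the bias factor, can be sketched in a few lines. The function name `choose_pivot`, the default tolerance, and the string return values are hypothetical conveniences, not part of the algorithm as stated. The extra guard for a vanishing second cosine is also an assumption added here, so that an unusable 2x2 pivot is never selected when the bias is infinite.

```python
import math

def choose_pivot(phi1, phi2, bias=1.0, tol=1e-8):
    """Return 'single', 'double' or 'fail' from the two cosines.

    phi1: cosine for the 1x1 pivot (may be signed).
    phi2: smaller of the two cosines for the 2x2 pivot, 0 if W_i is singular.
    """
    a1 = abs(phi1)
    # Action 2: both pivots are unacceptable, so flag failure.
    if a1 < tol and phi2 < tol:
        return "fail"
    # Assumption (not in the stated algorithm): with no usable 2x2 pivot,
    # only the single step remains, whatever the bias.
    if phi2 < tol:
        return "single"
    # Action 3: compare the 1x1 cosine against the biased 2x2 cosine.
    return "single" if a1 > bias * phi2 else "double"

print(choose_pivot(0.5, 0.6, bias=0.0))        # standard Lanczos
print(choose_pivot(0.5, 0.6, bias=math.inf))   # always a double step
```

With bias = 0 any usable 1x1 pivot is accepted, which reproduces the standard two-sided Lanczos process; with an infinite bias the comparison never succeeds and every step is a double step.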

The look-ahead process modifies the two-sided Lanczos algorithm
to the extent that the next residuals are already being formed before
the basis vectors of the previous step are set (action 1a). Further,
if the 1x1 pivot is used (two-sided Lanczos) the norms of residual
vectors and the 1x1 pivot are calculated without more vector inner
products (action 3f).

Finally, note that the relevant cosines (see section 2.10,
Chapter II) are calculated without calculating the basis vectors
involved (actions 1b and 1c).